Abstract

This paper explores the two-user Gaussian interference channel through the lens of a natural
deterministic channel model. The main result is that the deterministic channel uniformly
approximates the Gaussian channel, the capacity regions differing by a universal constant. The
problem of finding the capacity of the Gaussian channel to within a constant error is therefore
reduced to that of finding the capacity of the far simpler deterministic channel. Thus, the paper
provides an alternative derivation of the recent constant gap capacity characterization of Etkin,
Tse, and Wang [8]. Additionally, the deterministic model gives significant insight into the
Gaussian channel.



1 Introduction 
One of the longest outstanding problems in multiuser information theory is the capacity region of
the two-user Gaussian interference channel. This multiuser channel consists of two point-to-point
links with additive white Gaussian noise, interfering with each other through crosstalk (Figure 1).




Figure 1: Two-user Gaussian interference channel.

Each transmitter has an independent message intended only for the corresponding receiver. The
capacity region of this channel is the set of all simultaneously achievable rate pairs $(R_1, R_2)$ in the
two interfering links, and characterizes the fundamental tradeoff between the performance achievable
in the links in the face of interference. Unfortunately, the problem of characterizing this region
has been open for over thirty years. The capacity region is known in the strong interference case,
where each receiver has a better reception of the other user's signal than the intended receiver [10,5].
The best known strategy for the other cases is due to Han and Kobayashi [10]. This strategy is a
natural one and involves splitting the transmitted information of both users into two parts: private
information to be decoded only at the intended receiver and common information that can be decoded at
both receivers. By decoding the common information, part of the interference can be canceled,
while the remaining private information from the other user is treated as noise. The Han-Kobayashi
strategy allows arbitrary splits of each user's transmit power into the private and common information
portions as well as time sharing between multiple such splits. Unfortunately, the optimization
among such myriads of possibilities is not well understood, and it is also not clear how close to
capacity such a scheme can get and whether other strategies can do significantly better.

Significant progress on this problem has been made recently. In [8], it was shown that a very
simple Han-Kobayashi type scheme can in fact achieve rates within 1 bit/s/Hz of the capacity of
the channel for all values of the channel parameters. That is, this scheme can achieve the rate
pair $(R_1 - 1, R_2 - 1)$ for any $(R_1, R_2)$ in the interference channel capacity region. This result is
particularly relevant in the high signal-to-noise ratio (SNR) regime, where the achievable rates are
high and grow without bound as the noise level goes to zero. The high SNR regime is the
interference-limited scenario: when the noise is small, interference from one link will have a
significant impact on the performance of the other. Progress has also been made towards finding the
exact capacity region; by extending one of the converse arguments in [8], the authors of [13] and [1]
show that treating interference as noise is sum-rate optimal when the interference is sufficiently weak.

The purpose of the present paper is to show that the high SNR behavior of the Gaussian
interference channel characterized in [8] can in fact be fully captured by a natural underlying
deterministic interference channel. This type of deterministic channel model was first proposed by [2]
in the analysis of Gaussian relay networks. Applying this model to the interference scenario, we
show that the capacity of the resulting deterministic interference channel is the same, to within a
constant number of bits, as that of the corresponding Gaussian interference channel. Combined with the
capacity result for the two-user deterministic interference channel, the paper therefore provides an
alternative derivation of the constant gap result of [8] (albeit with a larger gap).

Because of the simplicity of the deterministic channel model, it provides considerable insight into the
structure of the various near-optimal schemes for the Gaussian interference channel in the different
parameter ranges. Where certain approximate statements and intuitions can be made regarding the
Gaussian interference channel, these statements are made precise in the deterministic setting. The
near-optimality for the Gaussian channel of the simple Han-Kobayashi scheme as shown in [8] is
made transparent in the deterministic channel: the derivation of the achievable strategy is completed
in a series of steps, each shown to be without loss of optimality. As an added benefit, the relatively
complicated genie-aided converse arguments are avoided.

The close connection between the deterministic and Gaussian channels, as demonstrated in
the example of the two-user interference channel discussed in this paper, suggests a new general
approach to attack multiuser information theory problems. Given a Gaussian network, one can
attempt to reduce the Gaussian problem to a deterministic one by proving a constant gap between
the capacity regions of the two models. It then remains only to find the capacity of the presumably
simpler deterministic channel. In [4], the less direct approach of transferring proof techniques from
the deterministic to Gaussian channel has been used successfully in approximating the capacity of
the Gaussian many-to-one interference channel, where there is an arbitrary number of users but
interference only happens at a single receiver. The approach used in [4] is therefore taken a step
further in this work.

2 Generalized Degrees of Freedom and Deterministic Model for the MAC

2.1 Generalized Degrees of Freedom 

Before the one-bit gap result [8], very little was known about the structure of the capacity region of
the two-user Gaussian interference channel. The investigation of the generalized degrees of freedom,
a concept introduced in [8], provided the first and crucial insight into the problem. In this section
we motivate this idea through the MAC, as well as provide a more abstract look into what makes the
generalized degrees of freedom so useful towards understanding the Gaussian interference channel.
Let us start with the point-to-point AWGN channel. The output is equal to 



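$$y = \sqrt{\mathrm{SNR}}\, x + z ,$$

where $z \sim \mathcal{CN}(0, 1)$ and the input satisfies an average power constraint

$$\frac{1}{N} \sum_{k=1}^{N} E\left[ |x_k|^2 \right] \le 1 .$$

The capacity is equal to

$$C(\mathrm{SNR}) = \log(1 + \mathrm{SNR}) .$$

In an attempt to capture the rough behavior of the capacity, one may calculate the limit

$$\lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}} = 1 . \tag{1}$$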

The limit in (1), the so-called degrees of freedom of the channel, measures how the capacity scales
with SNR. The degrees of freedom is thus a rough measure of capacity, with unit equal to a single
AWGN channel with appropriate SNR.
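
As a quick numerical illustration of the limit (1), the following sketch evaluates the ratio $C(\mathrm{SNR}) / \log \mathrm{SNR}$ at a few SNR values. It assumes base-2 logarithms; the function names `capacity_awgn` and `dof_ratio` are illustrative and do not come from the paper.

```python
import math


def capacity_awgn(snr: float) -> float:
    """Capacity of the complex AWGN channel in bits: log2(1 + SNR)."""
    return math.log2(1.0 + snr)


def dof_ratio(snr: float) -> float:
    """Ratio C(SNR) / log2(SNR); its limit as SNR grows is the degrees of freedom."""
    return capacity_awgn(snr) / math.log2(snr)


if __name__ == "__main__":
    # The ratio decreases towards 1 as the SNR (given here in dB) grows.
    for snr_db in (10, 30, 60, 90):
        snr = 10.0 ** (snr_db / 10.0)
        print(f"{snr_db:3d} dB  C/log(SNR) = {dof_ratio(snr):.9f}")
```

The ratio approaches 1 from above, which is the sense in which one AWGN channel counts as a single degree of freedom.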

We now attempt a similar understanding for the MAC. The channel output is

$$y = h_1 x_1 + h_2 x_2 + z ,$$

where $h_1, h_2 \in \mathbb{C}$, $z \sim \mathcal{CN}(0, 1)$, and each input satisfies an average power constraint

$$\frac{1}{N} \sum_{k=1}^{N} E\left[ |x_{i,k}|^2 \right] \le P_i , \quad i = 1, 2 .$$

The channel is parameterized by the signal-to-noise ratios $\mathrm{SNR}_1 = P_1 |h_1|^2$ and $\mathrm{SNR}_2 = P_2 |h_2|^2$,
and we assume without loss of generality that $\mathrm{SNR}_1 \ge \mathrm{SNR}_2$. The capacity region of the MAC is
(see Figure 3):

$$\begin{aligned}
R_1 &\le \log(1 + \mathrm{SNR}_1) \\
R_2 &\le \log(1 + \mathrm{SNR}_2) \\
R_1 + R_2 &\le \log(1 + \mathrm{SNR}_1 + \mathrm{SNR}_2) .
\end{aligned} \tag{2}$$




Figure 2: The classical degrees of freedom region for the MAC. 



Seeking simplification, a reasonable strategy is to attempt to compute a limit similar to (1).
However, there is not a clear choice of limit: the point-to-point channel had only one parameter and
thus no ambiguity arose, but in the MAC there are two parameters $\mathrm{SNR}_1$ and $\mathrm{SNR}_2$ and therefore
many ways of taking limits. Let $\mathcal{C}(h_1, h_2, P)$ denote the capacity region of the MAC (2) with
channel gains $h_1, h_2$ and power constraint $P$ for both users. One standard way of taking the limit of
the region is to let the power constraint $P$ tend to infinity, scaling by $\log P$:

$$\lim_{P \to \infty} \frac{\mathcal{C}(h_1, h_2, P)}{\log P} .$$

Calculating the limit, one finds that the resulting region (see Figure 2)

$$\begin{aligned}
d_1 &\le 1 \\
d_2 &\le 1 \\
d_1 + d_2 &\le 1
\end{aligned} \tag{3}$$



is altogether independent of the channel gains. More troubling, the limiting region (3) is misleading
from an operational viewpoint. The region seems to suggest that for high transmit powers, the
optimal scheme is time-sharing between the two rate points in which only one user transmits at a
time. But this is far from the truth, as a corner point of the capacity region has an arbitrarily greater
sum-rate as channel parameters are varied, for each fixed power constraint. This limit, therefore,
does not reveal any dynamic range between users, a quality that is relevant at finite SNR.

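A closer look at the capacity region itself leads to a different limit. Notice that the capacity
region can be approximated to within one bit per user as (see Figure 3)

$$\begin{aligned}
R_1 &\le \log(1 + \mathrm{SNR}_1) \approx \log \mathrm{SNR}_1 \\
R_2 &\le \log(1 + \mathrm{SNR}_2) \approx \log \mathrm{SNR}_2 \\
R_1 + R_2 &\le \log(1 + \mathrm{SNR}_1 + \mathrm{SNR}_2) \approx \log \mathrm{SNR}_1 .
\end{aligned} \tag{4}$$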



In order to roughly preserve the shape of the capacity region in the limit, equation (4) suggests
fixing the relationship between the two individual rate constraints, i.e.

$$\log \mathrm{SNR}_2 = \alpha \log \mathrm{SNR}_1 .$$

In other words, the ratio of SNRs is fixed in the dB scale. This is precisely the generalized degrees
of freedom limit,

$$\mathcal{D}(\alpha) := \lim_{\mathrm{SNR} \to \infty} \frac{\mathcal{C}(\mathrm{SNR}, \mathrm{SNR}^{\alpha})}{\log \mathrm{SNR}} ,$$

where $\mathcal{C}(\mathrm{SNR}_1, \mathrm{SNR}_2)$ denotes the capacity region of the MAC with signal-to-noise ratios
$\mathrm{SNR}_1, \mathrm{SNR}_2$. The resulting region (Figure 4) is

$$\begin{aligned}
d_1 &\le 1 \\
d_2 &\le \alpha \\
d_1 + d_2 &\le 1 .
\end{aligned} \tag{5}$$

Figure 3: The solid line shows the MAC capacity region. The dashed line shows the approximate
region as given in (4), and is within one bit per user of the capacity region.

Figure 4: The MAC generalized degrees of freedom region. The region is exactly the same as the
approximate region in Figure 3 normalized by $\log \mathrm{SNR}_1$.



Qualitatively, the generalized degrees of freedom limit preserves the dynamic range feature of
the finite-SNR channel. However, a more precise statement is true as well: because the approximation
to the region in (4) is to within one bit, independent of the channel gains, it follows that the degrees
of freedom region itself, when scaled by $\log \mathrm{SNR}_1$, is within one bit of the true region. Thus, varying
$\alpha$, the limiting regions (5) uniformly cover the entire collection of finite signal-to-noise ratio
channels. In other words, to find the approximate capacity of any MAC with (finite) signal-to-noise
ratios $\mathrm{SNR}_1, \mathrm{SNR}_2$, one simply needs to compute the generalized degrees of freedom limit for the
value $\alpha = \frac{\log \mathrm{SNR}_2}{\log \mathrm{SNR}_1}$.
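
The recipe in the preceding paragraph can be tried numerically. The sketch below compares the right-hand sides of the exact region (2) with those of the approximation (4), and checks that the region (5), scaled by $\log \mathrm{SNR}_1$, reproduces (4). It assumes base-2 logarithms and $\mathrm{SNR}_1 \ge \mathrm{SNR}_2 \ge 1$; all function names are illustrative.

```python
import math


def mac_exact(snr1: float, snr2: float) -> tuple[float, float, float]:
    """Right-hand sides of the exact MAC region (2), in bits."""
    return (math.log2(1 + snr1),
            math.log2(1 + snr2),
            math.log2(1 + snr1 + snr2))


def mac_approx(snr1: float, snr2: float) -> tuple[float, float, float]:
    """Right-hand sides of the approximate region (4); assumes snr1 >= snr2 >= 1."""
    return (math.log2(snr1), math.log2(snr2), math.log2(snr1))


def alpha(snr1: float, snr2: float) -> float:
    """Exponent alpha defined by log SNR2 = alpha * log SNR1."""
    return math.log2(snr2) / math.log2(snr1)


def gdof_region(a: float) -> tuple[float, float, float]:
    """Right-hand sides of the generalized degrees of freedom region (5)."""
    return (1.0, a, 1.0)


def gaps(snr1: float, snr2: float) -> tuple[float, ...]:
    """Per-constraint difference between the exact and approximate bounds."""
    return tuple(e - p for e, p in zip(mac_exact(snr1, snr2), mac_approx(snr1, snr2)))


if __name__ == "__main__":
    snr1, snr2 = 2.0 ** 20, 2.0 ** 10
    a = alpha(snr1, snr2)
    print("alpha =", a)
    print("scaled GDoF bounds:", [d * math.log2(snr1) for d in gdof_region(a)])
    print("gaps to exact bounds:", gaps(snr1, snr2))
```

For this example the scaled region (5) coincides with (4) by construction, and each printed gap is a small fraction of a bit.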

In the MAC, we observed that the generalized degrees of freedom limit correctly expresses the
finite-SNR behavior. We now reflect on what properties, more abstractly, constitute a useful limit.
Visually, a limit corresponds to a choice of path $(\mathrm{SNR}, f(\mathrm{SNR}))$ in the $(\mathrm{SNR}_1, \mathrm{SNR}_2)$ plane
(Figure 5). Thus, a first requirement is to choose a function $f$ such that the limit exists:

$$\mathcal{D}(f) := \lim_{\mathrm{SNR} \to \infty} \frac{\mathcal{C}(\mathrm{SNR}, f(\mathrm{SNR}))}{\log \mathrm{SNR}} . \tag{6}$$

Figure 5: An example limit path in the $(\mathrm{SNR}_1, \mathrm{SNR}_2)$ plane.

Figure 6: The figure illustrates the notion of a limit region uniformly approximating the capacity
region. Suppose the capacity, scaled by $\log \mathrm{SNR}_1$, is constant along the limit paths. The dashed
lines show several example limit paths. Then, to find the capacity region at any point $(s_1, s_2)$ in the
$(\mathrm{SNR}_1, \mathrm{SNR}_2)$ plane, one may simply follow the path (denoted by $f$) to the infinite arc, resulting
in $\mathcal{D}(f)$.
Although many trajectories are possible, if the goal is a better understanding of the capacity
region for finite power-to-noise ratios, some limit paths are better than others. Suppose, for example,
that it was possible to choose $f$ such that

$$\frac{\mathcal{C}(\mathrm{SNR}, f(\mathrm{SNR}))}{\log \mathrm{SNR}} = \mathrm{constant}(f) \tag{7}$$

for the entire range $\mathrm{SNR} > 0$. In words, the scaled capacity in (7) is constant along the path $f$. In
this case, the problem of finding the limit (6) is precisely the same as that of finding the capacity
region for each point along the entire trajectory! Moreover, if after computing the limit one could
vary $f$ so as to cover all points (SNR, INR), the problem of finding the capacity of the channel is
completely solved.

Figure 6 further explains this idea. We consider the scaled (by $\log \mathrm{SNR}$) capacity region. After
taking a limit, one has the scaled capacity region at each point on an arc of infinite radius. Now,
upon choosing an arbitrary point $(s_1, s_2)$ in the $(\mathrm{SNR}_1, \mathrm{SNR}_2)$ plane, a good limit should allow one to
deduce, from the scaled capacities on the infinite-radius arc, the (approximate) scaled capacity at
$(s_1, s_2)$. Hence the significance of (7), which allows one to equate the scaled capacity at finite SNRs
with the limiting regions: if condition (7) is satisfied, one may simply choose the path $f$ containing
the point $(s_1, s_2)$, which gives

$$\mathcal{C}(s_1, s_2) = \mathcal{C}(s_1, f(s_1)) = \log s_1 \cdot \mathcal{D}(f) .$$

For the MAC, the set of trajectories defining the generalized degrees of freedom limit satisfies
(7) to within a universal constant, independent of SNR. The generalized degrees of freedom of the
Gaussian MAC (5) is the limit (6) along the path

$$f(s) = s^{\alpha} .$$

The generalized degrees of freedom of the MAC is intimately connected to, and captured by, a certain
deterministic channel model. In fact, the capacity region of the deterministic channel is, when
properly scaled, equal to the generalized degrees of freedom region. Equivalently, the deterministic
channel satisfies (7) exactly.

2.2 Deterministic Channel 

In this section we introduce a deterministic channel model analogous to the Gaussian channel. This
channel was first introduced in [2]. We begin by describing the deterministic channel model for the
point-to-point AWGN channel, and then the two-user multiple-access channel. After understanding
these examples, we present the deterministic interference channel.

Consider first the model for the point-to-point channel (see Figure 7). The real-valued channel
input is written in base 2; the signal, a vector of bits, is interpreted as occupying a succession of
levels:

$$x = 0.b_1 b_2 b_3 b_4 b_5 \ldots .$$

The most significant bit coincides with the highest level, the least significant bit with the lowest
level. The levels attempt to capture the notion of signal scale; a level corresponds to a unit of power
in the Gaussian channel, measured on the dB scale. Noise is modeled in the deterministic channel
by truncation. Bits of smaller order than the noise are lost. The channel may be written as

$$y = \lfloor 2^{n} x \rfloor ,$$

with the correspondence $n = \lfloor \log \mathrm{SNR} \rfloor$.

Figure 7: The deterministic model for the point-to-point Gaussian channel. Each bit of the input
occupies a signal level. Bits of lower significance are lost due to noise.
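
The truncation model is easy to try out. The following sketch, assuming an input $0 \le x < 1$ and base-2 logarithms, computes $y = \lfloor 2^n x \rfloor$ and the level count $n = \lfloor \log \mathrm{SNR} \rfloor$; the function names are illustrative.

```python
import math


def deterministic_p2p(x: float, n: int) -> int:
    """Output y = floor(2^n * x) for 0 <= x < 1, i.e. the n most significant bits of x."""
    return math.floor((2 ** n) * x)


def levels_from_snr(snr: float) -> int:
    """Number of levels above the noise, n = floor(log2 SNR); assumes SNR >= 1."""
    return math.floor(math.log2(snr))


if __name__ == "__main__":
    x = 0.8125  # binary 0.1101
    for n in range(5):
        # Each extra level reveals one more bit of x; bits below level n are lost.
        print(n, format(deterministic_p2p(x, n), "b"))
    print("levels at SNR = 100:", levels_from_snr(100.0))
```

Two inputs that agree in their first $n$ bits, such as 0.75 and 0.8125 for $n = 2$, produce the same output, which is how the model represents bits lost to noise.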

The deterministic multiple-access channel is constructed similarly to the point-to-point channel
(Figure 8), with $n_1$ and $n_2$ bits received above the noise level from users 1 and 2, respectively. To
model the superposition of signals at the receiver, the bits received on each level are added modulo
two. Addition modulo two, rather than normal integer addition, is chosen to make the model more
tractable. As a result, the levels do not interact with one another.

If the inputs $x_i$ are written in binary, the channel output can be written as

$$y = \lfloor 2^{n_1} x_1 \rfloor \oplus \lfloor 2^{n_2} x_2 \rfloor , \tag{8}$$

where addition is performed on each bit (modulo two) and $\lfloor \cdot \rfloor$ is the integer-part function. The
channel can be written in an alternative form, which we will not use in the present paper but which
leads to a slightly different interpretation. The input and output are $x_1, x_2, y \in \mathbb{F}_2^q$, where
$q = \max(n_1, n_2)$. The signal from transmitter $i$ is scaled by a nonnegative integer gain $2^{n_i}$
(equivalently, the input column vector is shifted up by $n_i$). The channel output is given by

$$y = S^{q - n_1} x_1 + S^{q - n_2} x_2 , \tag{9}$$

where summation and multiplication are in $\mathbb{F}_2$ and $S$ is a $q \times q$ shift matrix.
(10)
The capacity region of the deterministic MAC is

$$\begin{aligned}
r_1 &\le n_1 \\
r_2 &\le n_2 \\
r_1 + r_2 &\le \max(n_1, n_2) .
\end{aligned} \tag{11}$$

Comparing with (4), we make the correspondence

$$n_1 = \lfloor \log \mathrm{SNR}_1 \rfloor , \quad n_2 = \lfloor \log \mathrm{SNR}_2 \rfloor .$$
Figure 8: The deterministic model for the Gaussian multiple-access channel. Incoming bits on the
same level are added modulo two at the receiver.

Evidently, the capacity region of the deterministic MAC is constant when normalized by $n_1$ with the
ratio $\alpha = n_2 / n_1$ held fixed. Thus, the deterministic MAC satisfies (7) exactly when the gains are
integer-valued; the normalized capacity along any point in the limit path is equal to the degrees of
freedom of the deterministic MAC, which is in turn equal to the degrees of freedom of the Gaussian
MAC.
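
To make the deterministic MAC concrete, here is a small sketch of the output map (8) and the region (11). It represents each input by the integer formed from its $q$ most significant bits, so that $\lfloor 2^{n} x \rfloor$ becomes a right shift by $q - n$; this representation and the function names are assumptions made for illustration.

```python
from itertools import product


def det_mac_output(b1: int, b2: int, n1: int, n2: int, q: int) -> int:
    """Bitwise XOR of the top n1 bits of b1 and the top n2 bits of b2.

    b1 and b2 are q-bit integers with q >= max(n1, n2).
    """
    return (b1 >> (q - n1)) ^ (b2 >> (q - n2))


def in_det_mac_region(r1: float, r2: float, n1: int, n2: int) -> bool:
    """Membership test for the deterministic MAC region (11)."""
    return r1 <= n1 and r2 <= n2 and r1 + r2 <= max(n1, n2)


def count_outputs(n1: int, n2: int) -> int:
    """Number of distinct channel outputs over all input pairs."""
    q = max(n1, n2)
    return len({det_mac_output(b1, b2, n1, n2, q)
                for b1, b2 in product(range(2 ** q), repeat=2)})


if __name__ == "__main__":
    print(det_mac_output(0b101, 0b110, 3, 2, 3))   # 0b101 XOR 0b011
    print(count_outputs(3, 2))                     # distinct outputs for n1=3, n2=2
    print(in_det_mac_region(1, 2, 3, 2), in_det_mac_region(2, 2, 3, 2))
```

The output takes at most $2^{\max(n_1, n_2)}$ distinct values, which is consistent with the sum-rate constraint in (11).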



3 Deterministic Interference Channel 

In Section 2 we motivated the generalized degrees of freedom limit and saw how it led to a simple
deterministic model. The generalized degrees of freedom, and the equivalent deterministic model,
were seen to uniformly approximate the MAC. With this success in explaining the MAC, a logical
next step is to apply the deterministic model to the Gaussian interference channel.
The Gaussian interference channel is given by

$$\begin{aligned}
y_1 &= h_{11} x_1 + h_{12} x_2 + z_1 \\
y_2 &= h_{21} x_1 + h_{22} x_2 + z_2 ,
\end{aligned}$$

where $z_i \sim \mathcal{CN}(0, 1)$ and the channel inputs satisfy an average power constraint

$$\frac{1}{N} \sum_{k=1}^{N} E\left[ |x_{i,k}|^2 \right] \le P_i , \quad i = 1, 2 .$$

The channel is parameterized by the power-to-noise ratios $\mathrm{SNR}_1 = |h_{11}|^2 P_1$, $\mathrm{SNR}_2 = |h_{22}|^2 P_2$,
$\mathrm{INR}_1 = |h_{21}|^2 P_1$, $\mathrm{INR}_2 = |h_{12}|^2 P_2$.

We proceed with the deterministic interference channel model (Figure 9). Note that the model is
completely determined by the model for the MAC. There are two transmitter-receiver pairs (links),
and as in the Gaussian case, each transmitter wants to communicate only with its corresponding
receiver. The signal from transmitter $j$, as observed at receiver $i$, is scaled by a nonnegative integer
gain $2^{n_{ij}}$ (equivalently, the input column vector is shifted up by $n_{ij}$). At each time $t$, the input and
output, respectively, at link $i$ are $x_i(t), y_i(t) \in \mathbb{F}_2^q$, where $q = \max_{ij} n_{ij}$.

The channel output at receiver $i$ is given by

$$y_i(t) = S^{q - n_{i1}} x_1(t) \oplus S^{q - n_{i2}} x_2(t) , \tag{12}$$

where summation and multiplication are in $\mathbb{F}_2$ and $S$ is defined in (10).

Figure 9: At left is a deterministic interference channel. The more compact figure at right shows
only the signals as observed at the receivers.

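If the inputs $x_i$ are written in binary, the channel can equivalently be written as

$$\begin{aligned}
y_1 &= \lfloor 2^{n_{11}} x_1 \rfloor \oplus \lfloor 2^{n_{12}} x_2 \rfloor \\
y_2 &= \lfloor 2^{n_{21}} x_1 \rfloor \oplus \lfloor 2^{n_{22}} x_2 \rfloor ,
\end{aligned}$$

where addition is performed on each bit (modulo two) and $\lfloor \cdot \rfloor$ is the integer-part function. We will
use the latter representation in this paper.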
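
A minimal sketch of this channel map follows. As an assumption made for illustration, each input is represented by the integer formed from its $q$ most significant bits, so the gain $2^{n_{ij}}$ followed by truncation becomes a right shift by $q - n_{ij}$; the function name and the nested-tuple layout of the gains are hypothetical.

```python
def det_ic_outputs(b1: int, b2: int,
                   n: tuple[tuple[int, int], tuple[int, int]]) -> tuple[int, int]:
    """Outputs (y1, y2) of the deterministic interference channel.

    n = ((n11, n12), (n21, n22)); b1 and b2 are q-bit integers, q = max of all gains.
    """
    q = max(max(row) for row in n)
    # Receiver i sees the top n_i1 bits of user 1 XOR the top n_i2 bits of user 2.
    y1 = (b1 >> (q - n[0][0])) ^ (b2 >> (q - n[0][1]))
    y2 = (b1 >> (q - n[1][0])) ^ (b2 >> (q - n[1][1]))
    return y1, y2


if __name__ == "__main__":
    gains = ((3, 1), (2, 3))
    y1, y2 = det_ic_outputs(0b101, 0b110, gains)
    print(format(y1, "03b"), format(y2, "03b"))
```

With zero cross gains the two links decouple and each receiver sees only its own transmitter's bits.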

In the analysis of the deterministic interference channel, it will be helpful to consult a different
style of figure. The left-hand side of Figure 9 depicts a deterministic interference channel, and the
right-hand side shows only the perspective of each receiver. Each incoming signal is shown as a
column vector, with the highest element corresponding to the most significant bit and the portion
below the noise level truncated. The observed signal at each receiver is the modulo 2 sum of the
elements on each level. In the sequel, the dashed lines indicating the position of each entry of the
vector will be omitted.

Just as in the discussion of the MAC, the deterministic interference channel uniformly approximates
the Gaussian channel. In finding the capacity of the Gaussian interference channel to within
a constant number of bits, it therefore suffices to find the capacity of the far simpler deterministic
channel.

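Theorem 1. The capacity of the two-user Gaussian interference channel with signal and interference
to noise ratios $\mathrm{SNR}_1, \mathrm{SNR}_2, \mathrm{INR}_1, \mathrm{INR}_2$ is within 42 bits per user of the capacity of a
deterministic interference channel with gains $2^{n_{11}} = 2^{\lfloor \log \mathrm{SNR}_1 \rfloor}$, $2^{n_{12}} = 2^{\lfloor \log \mathrm{INR}_2 \rfloor}$,
$2^{n_{21}} = 2^{\lfloor \log \mathrm{INR}_1 \rfloor}$, and $2^{n_{22}} = 2^{\lfloor \log \mathrm{SNR}_2 \rfloor}$.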

Proof. The capacity of the two-user Gaussian interference channel has been characterized to within
one bit by Etkin, Tse, and Wang [8]; thus, we could prove the theorem by following the approach
used for the MAC in Section 2, comparing the capacity regions of the deterministic and Gaussian
channels. We instead choose to prove the theorem with no a priori knowledge of the result for the
Gaussian channel. This approach provides insight into the deep connection between the deterministic
and Gaussian channels, and also gives an alternative derivation of the constant-bit characterization
of [8] (but with a significantly larger gap). The proof is deferred to the appendix. □

Theorem 1 gives as a corollary that the generalized degrees of freedom of the two-user Gaussian
interference channel is exactly equal to the scaled capacity of the corresponding deterministic
channel. This explains why the degrees of freedom limit characterizes, up to a constant, the capacity of
the Gaussian channel.
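
The correspondence in Theorem 1 between the Gaussian parameters and the deterministic gains can be written as a short helper. This is a sketch under stated assumptions: logarithms are taken base 2, and ratios below 1 (where the floor of the logarithm would be negative) are mapped to zero levels, a clamping choice made here for illustration rather than taken from the theorem statement. The name `det_gains` is hypothetical.

```python
import math


def det_gains(snr1: float, snr2: float, inr1: float, inr2: float):
    """Return ((n11, n12), (n21, n22)) with n11 = floor(log2 SNR1), n12 = floor(log2 INR2),
    n21 = floor(log2 INR1), n22 = floor(log2 SNR2)."""

    def levels(ratio: float) -> int:
        # Assumption: a link whose ratio is below 1 contributes no levels above the noise.
        if ratio < 1.0:
            return 0
        return math.floor(math.log2(ratio))

    return ((levels(snr1), levels(inr2)), (levels(inr1), levels(snr2)))


if __name__ == "__main__":
    print(det_gains(1000.0, 500.0, 10.0, 3.0))
```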
4 Structure of Optimal Strategy for Deterministic Channel

El Gamal and Costa's characterization of the capacity region for a class of deterministic interference 
channels [9] applies to this particular deterministic channel. Moreover, it is not difficult to determine 
the optimal input distribution from their expression. But it is not immediately apparent why this 
region is in fact optimal. 

The goal of this section is to derive from the beginning, using only the most basic tools of
information theory, the (arguably) natural optimal achievable strategy. Although the resulting strategy
coincides with a specific Han and Kobayashi strategy, by proceeding in this way we hope to
demystify the structure of the achievable strategy. In particular, we will see how common and private
messages arise inevitably, quickly giving the capacity region of the channel. It is noteworthy that
no separate outer bounds are required. Thus, the intuitive appeal of this approach is bolstered by it
not requiring the side-information converse proofs of [8] and [9].

The natural decomposition of messages into common and private parts was motivated at an
intuitive level for the Gaussian interference channel in Sections 6 and 7 of [8]. In the setting of the
deterministic channel, the arguments of this section make those ideas precise.

The following standard definitions and notation will be used. Denote by $\mathcal{M}_1 = \{1, \ldots, M_1\}$
and $\mathcal{M}_2 = \{1, \ldots, M_2\}$ the message sets of users 1 and 2. Let the encoding functions
$f_i : \mathcal{M}_i \to \mathcal{X}_i$ with $f_i(j) = x_i(j)$ map the message $j$ generated at user $i$ into the length-$N$
codeword $x_i(j)$. Let the decoding functions $g_i(y_i)$ map the received signal $y_i$ to the message $j$ if
$y_i \in D_{ij}$, where $D_{ij}$ is the decoding set of message $j$ for user $i$. An $(N, M_1, M_2, \mu)$ code consists of
$M_i$ codewords $x_i(j)$ and $M_i$ decoding sets $D_{ij}$ such that the average probability of decoding error is
at most $\mu$, that is,

$$\frac{1}{M_1 M_2} \sum_{j,k} P\left( D_{1j} \mid x_1(j), x_2(k) \right) \ge 1 - \mu ,$$

$$\frac{1}{M_1 M_2} \sum_{j,k} P\left( D_{2k} \mid x_1(j), x_2(k) \right) \ge 1 - \mu .$$

A pair of nonnegative real numbers $(r_1, r_2)$ is called an achievable rate for the deterministic
interference channel if for any $\epsilon > 0$, $0 < \mu < 1$, and for any sufficiently large $N$, there exists an
$(N, M_1, M_2, \mu)$ code such that

$$\frac{1}{N} \log M_i \ge r_i - \epsilon , \quad i = 1, 2 .$$

The first lemma is a simple analogue of Shannon's point-to-point channel coding theorem, stating
that the mutual information between input and output determines the capacity region.

Lemma 1. The rate point $(r_1, r_2)$ is achievable if and only if for every $\epsilon > 0$ there exists a block
length $N$ and a factorized joint distribution $p(x_1^N) p(x_2^N)$ with

$$\begin{aligned}
r_1 - \epsilon &\le \frac{1}{N} I(x_1^N; y_1^N) \\
r_2 - \epsilon &\le \frac{1}{N} I(x_2^N; y_2^N) .
\end{aligned} \tag{13}$$

Proof. Fix a block length $N$ and joint distribution $p(x_1^N) p(x_2^N)$. Each user $i = 1, 2$ will use the
distribution $p(x_i^N)$ as an inner code, using $k$ blocks of length $N$. The codebooks are constructed
using random coding, and the achievability of $(r_1, r_2)$ follows by the random coding argument (with
joint typicality decoding) for the point-to-point discrete memoryless channel.



Figure 10: The figure depicts the received signal at each receiver. Notice that the private signals (as
defined in Lemma 2), $x_{1p}, x_{2p}$, are not observed at the other receiver.

As in the point-to-point case, the converse is a straightforward application of Fano's inequality:

$$\begin{aligned}
N r_i = H(W_i) &= H(W_i \mid y_i^N) + I(W_i; y_i^N) \\
&\le 1 + P_e^{(N)} N r_i + I(x_i^N; y_i^N) , \quad i = 1, 2 .
\end{aligned}$$

It is assumed that $P_e^{(N)} \to 0$ as $N \to \infty$. Dividing by $N$ and taking $N$ sufficiently large gives the
desired result. □



The next two lemmas are the most important of this section; they show the optimality of sepa- 
rating each message into a private and common message (the terms common and private are to be 
justified later, and for now to be regarded simply as labels). 

Lemma 2. Given any achievable rate point $(r_1, r_2)$, this rate-point is achievable using a code with
the following decomposition.

1. The channel inputs, $x_1^N$ and $x_2^N$, are separated into components consisting of common and
private information:

$$x_1^N = (x_{1p}^N, x_{1c}^N) , \quad x_2^N = (x_{2p}^N, x_{2c}^N) .$$

2. The message sets are separated into private and common messages, i.e. $\mathcal{M}_i = \mathcal{M}_{ic} \times \mathcal{M}_{ip}$
for users $i = 1, 2$, with the common signal $x_{ic}^N = f_i^c(m_{ic})$ a function only of the common message
$m_{ic} \in \mathcal{M}_{ic}$ and the private signal $x_{ip}^N = f_i^p(m_{ip}, m_{ic})$ a function of both the private and common
messages $(m_{ip}, m_{ic}) \in \mathcal{M}_{ip} \times \mathcal{M}_{ic}$.

3. The common rate is less than the entropy of the common signal, that is $r_i^c \le \frac{1}{N} H(x_{ic}^N)$.

Proof. Consider an achievable rate point $(r_1, r_2)$. The proof follows by converting an arbitrary
achievable strategy to one that satisfies the desired properties. Fix $\epsilon > 0$, a block length $N'$, and an
arbitrary distribution $p(x_1^{N'}) p(x_2^{N'})$ such that (13) is satisfied with $\epsilon/2$. Write the input as
$x_1^{N'} = (x_{1p}^{N'}, x_{1c}^{N'})$, where $x_{1p}^{N'}$ is the input $x_1^{N'}$ restricted to the lowest $(n_{11} - n_{21})^+$ levels, $x_{1c}^{N'}$ is the
restriction to the highest $n_{21}$ levels, and similarly for $x_{2p}^{N'}, x_{2c}^{N'}$ (see Figure 10). Note that if
$n_{21} \ge n_{11}$ ($n_{12} \ge n_{22}$) then the private signal $x_{1p}^{N'}$ ($x_{2p}^{N'}$) is empty.

It must now be verified that transmitter $i$ can separate the message set $\mathcal{M}_i$ into the direct product
of two message sets $\mathcal{M}_{ip} \times \mathcal{M}_{ic}$. The scheme uses a superposition code, as used for the degraded
broadcast channel (see e.g. Q), with $x_{ic}$ serving as the cloud centers and $x_{ip}$ as the clouds. To see
that this is possible, put for $i = 1, 2$,

$$\begin{aligned}
r_i^c &= \frac{1}{N'} I(x_{ic}^{N'}; y_i^{N'}) - \frac{\epsilon}{4} \\
r_i^p &= \frac{1}{N'} I(x_{ip}^{N'}; y_i^{N'} \mid x_{ic}^{N'}) - \frac{\epsilon}{4} .
\end{aligned} \tag{14}$$

Then from the chain rule we have

$$\begin{aligned}
r_i^c + r_i^p &= \frac{1}{N'} I(x_{ic}^{N'}; y_i^{N'}) - \frac{\epsilon}{4} + \frac{1}{N'} I(x_{ip}^{N'}; y_i^{N'} \mid x_{ic}^{N'}) - \frac{\epsilon}{4} \\
&= \frac{1}{N'} I(x_i^{N'}; y_i^{N'}) - \frac{\epsilon}{2} \ge r_i - \epsilon .
\end{aligned}$$

Figure 11: Lemma 2 shows that we may view the common signal and private signal of each user as
coming from two separate users, with the private user having access to the signal from the common
user.

For some sufficiently large super-block length $k$, generate for $i = 1, 2$, $2^{N' k r_i^c}$ independent
codewords of length $N'k$, $x_{ic}^{N'k}(m_{ic})$, according to $\prod_{t=1}^{k} p(x_{ic,t}^{N'})$. The block-length $N$ in the statement of
the lemma is given by $N = N'k$. Now, for each codeword $x_{ic}^{N'k}(m_{ic})$, generate $2^{N' k r_i^p}$ codewords
of length $N'k$, $x_{ip}^{N'k}(m_{ic}, m_{ip})$, according to the conditional distribution $\prod_{t=1}^{k} p(x_{ip,t}^{N'} \mid x_{ic,t}^{N'}(m_{ic}))$.
Decoding is accomplished using joint typicality, and the probability of error may be taken as small
as desired by choosing $k$ large. Since $\epsilon$ was arbitrary, this proves the lemma. □
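
The level split used in the proof of Lemma 2 can be illustrated for a single channel use. In the sketch below, the input of a user is represented by the integer formed from the bits seen at its own receiver (`n_direct` bits, e.g. $n_{11}$ for user 1), and `n_cross` is the number of levels seen at the other receiver (e.g. $n_{21}$). When `n_cross` exceeds `n_direct`, the sketch treats the whole input as common, an assumption consistent with the private signal being empty in that case. The function name is hypothetical.

```python
def split_levels(b: int, n_direct: int, n_cross: int) -> tuple[int, int]:
    """Split an n_direct-bit input into (common, private) parts.

    private: the lowest (n_direct - n_cross)^+ bits, invisible at the other receiver.
    common:  the remaining highest bits, which the other receiver observes.
    """
    n_private = max(n_direct - n_cross, 0)
    common = b >> n_private
    private = b & ((1 << n_private) - 1)
    return common, private


if __name__ == "__main__":
    n11, n21 = 5, 2
    b = 0b10110
    common, private = split_levels(b, n11, n21)
    print(format(common, "b"), format(private, "b"))
    # The other receiver sees the top n21 bits of b, which is exactly the common part.
    print((b >> (n11 - n21)) == common)
```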

The previous lemma shows that we may consider the deterministic interference channel as a
channel with four senders and two decoders, as in Figure 11. This interpretation motivates the next
lemma, which shows that each user is able to decode the common information of the interfering
user. The lemma makes use of facts concerning the multiple access channel. For background on 
the multiple access channel see e.g. ifTTH Tll. The lemma can essentially be deduced from the result 
by Costa and El Gamal on discrete memoryless interference channels with strong interference [6].
The result itself is analogous to Sato's result for the Gaussian interference channel in the strong
interference regime [12]; however, because Lemma 2 shows that the signal ought to be separated
into common and private components, the argument applies to the entire parameter range.

By the MAC at receiver 1 we mean the MAC formed by the two users transmitting signals
$(x_{1p}, x_{1c})$ and $x_{2c}$ at rates $r_1^p + r_1^c$ and $r_2^c$, respectively, with receiver 1 required to reliably decode
both signals, and similarly for the MAC at receiver 2.
Lemma 3. The region is exactly described by the compound MAC formed by the MAC at each of the
two receivers, along with constraints on the private rate. Furthermore, the region has a single-letter
representation.

Proof. Suppose the rate-point $(r_1, r_2)$ is achievable. By Lemma 2 we may assume that each user's
common signal is a function only of the common message, and that

$$r_i^c \le \frac{1}{N} H(x_{ic}^N) , \quad i = 1, 2 . \tag{15}$$

Then each user, upon successfully decoding their own signal and subtracting it off, has a clear view
of the other user's common signal $x_{ic}^N$. But, since the common rate is smaller than the entropy of
the common signal (15), it is possible to recover the common message $m_{ic}$ with arbitrarily small
probability of error when $N$ is taken to be large enough; in other words, each user can reliably
decode the other user's common message.
The joint distribution of the channel is 

The fact that each receiver can decode the common message of the other user implies, by Fano's
inequality, that

$$\frac{1}{N} H(m_{1c}, m_{1p}, m_{2c} \mid y_1^N) \to 0$$

and

$$\frac{1}{N} H(m_{1c}, m_{2p}, m_{2c} \mid y_2^N) \to 0$$

as $N \to \infty$.

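Proceeding as in the converse argument for the MAC (see e.g. [7]), one can show that for any
joint distribution (16) the rate point $(r_1^p, r_1^c, r_2^p, r_2^c)$ satisfies a number of constraints. First, the rate
point $(r_1^p + r_1^c, r_2^c)$ must lie within the MAC at receiver 1 and the rate point $(r_1^c, r_2^p + r_2^c)$ must lie
within the MAC at receiver 2. Additionally, there are constraints on the private rates $r_1^p, r_2^p$ and the
rates $r_1^p + r_2^c$ and $r_2^p + r_1^c$. More precisely, there exists a distribution
$p(x_{1p} \mid x_{1c}, q)\, p(x_{1c} \mid q)\, p(x_{2p} \mid x_{2c}, q)\, p(x_{2c} \mid q)\, p(q)$ such that

$$\begin{aligned}
r_1 + r_2^c = r_1^p + r_1^c + r_2^c &\le I(x_{1c}, x_{1p}, x_{2c}; y_1 \mid Q) \\
r_1 = r_1^p + r_1^c &\le I(x_{1c}, x_{1p}; y_1 \mid x_{2c}, Q) \\
r_2^c &\le I(x_{2c}; y_1 \mid x_{1c}, x_{1p}, Q) \\
r_1^p + r_2^c &\le I(x_{1p}, x_{2c}; y_1 \mid x_{1c}, Q) \\
r_1^p &\le I(x_{1p}; y_1 \mid x_{2c}, x_{1c}, Q) \\
r_2 + r_1^c = r_2^p + r_2^c + r_1^c &\le I(x_{2c}, x_{2p}, x_{1c}; y_2 \mid Q) \\
r_2 = r_2^p + r_2^c &\le I(x_{2p}, x_{2c}; y_2 \mid x_{1c}, Q) \\
r_1^c &\le I(x_{1c}; y_2 \mid x_{2c}, x_{2p}, Q) \\
r_2^p + r_1^c &\le I(x_{2p}, x_{1c}; y_2 \mid x_{2c}, Q) \\
r_2^p &\le I(x_{2p}; y_2 \mid x_{1c}, x_{2c}, Q) .
\end{aligned} \tag{17}$$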

Conversely, if the rate tuple $(r_1^p + r_1^c, r_2^c)$ is within the MAC at receiver 1, and
$(r_1^c, r_2^p + r_2^c)$ is within the MAC at receiver 2, and the additional constraints on $r_1^p, r_2^p$
are satisfied, then the rate point $(r_1^p, r_1^c, r_2^p, r_2^c)$ is achievable using a superposition random
code as in Lemma 2 and joint typicality decoding. □

Figure 12: From the figure it is possible to understand the constraints (18) as areas of rectangles.

The next lemma makes the region in equation (17) explicit.

Lemma 4. Observe that the optimizing (simultaneously for each of the constraints in (17)) input
distribution is uniform for each signal. This allows us to write the region as

$$
\begin{aligned}
r_1^p + r_1^c + r_2^c &\le n_{11} + \min(n_{22}, (n_{12} - n_{11})^+) \\
r_1^p + r_1^c &\le n_{11} \\
r_2^c &\le \min(n_{12}, n_{22}) \\
r_1^p + r_2^c &\le \min(n_{22} + (n_{11} - n_{21})^+, n_{12}) \\
r_1^p &\le n_{11} - n_{21} \\
r_2^p + r_2^c + r_1^c &\le n_{22} + \min(n_{11}, (n_{21} - n_{22})^+) \\
r_2^p + r_2^c &\le n_{22} \\
r_1^c &\le \min(n_{21}, n_{11}) \\
r_2^p + r_1^c &\le \min(n_{11} + (n_{22} - n_{12})^+, n_{21}) \\
r_2^p &\le n_{22} - n_{12} .
\end{aligned}
\tag{18}
$$

Proof. Intuitively, the private signal should be uniform because it helps the intended receiver decode
and does not cause interference, and the common signal should be uniform because it helps both
receivers decode.

Fix a joint distribution and consider a rate point satisfying the constraints of the previous lemma.
From the equations of the previous lemma, it is easy to see that $p(x_{1p})$ should be uniform in any
optimal distribution, since this increases the mutual information terms where $x_{1p}$ appears. Similarly,
$p(x_{1c})$ should be uniform. This allows us to evaluate the mutual information expressions in equation
(17), resulting in the stated region. □

Remark 1. The constraints of Lemma 4 admit a simple interpretation in terms of the areas of the
relevant rectangles in Figure 12.



The constraints (18) determine the capacity region of the deterministic channel; using Fourier-Motzkin
elimination one can solve for the region in terms of constraints on $r_1$ and $r_2$. Alternatively,
note that the deterministic interference channel of this paper falls within the class of more general
deterministic channels whose capacity is given in Theorem 1 of [9]. Applying this theorem, the
deterministic channel capacity region is the set of nonnegative rates satisfying

$$
\begin{aligned}
r_i &\le n_{ii}, \qquad i = 1, 2 \\
r_1 + r_2 &\le (n_{11} - n_{12})^+ + \max(n_{22}, n_{12}) \\
r_1 + r_2 &\le (n_{22} - n_{21})^+ + \max(n_{11}, n_{21}) \\
r_1 + r_2 &\le \max(n_{21}, (n_{11} - n_{12})^+) + \max(n_{12}, (n_{22} - n_{21})^+) \\
2r_1 + r_2 &\le \max(n_{11}, n_{21}) + (n_{11} - n_{12})^+ + \max(n_{12}, (n_{22} - n_{21})^+) \\
r_1 + 2r_2 &\le \max(n_{22}, n_{12}) + (n_{22} - n_{21})^+ + \max(n_{21}, (n_{11} - n_{12})^+) .
\end{aligned}
$$

Figure 13: The sum-rate capacity of the deterministic interference channel, normalized by $n$. The
dotted line continuing downwards from the point $(1/2, 1/2)$ is the rate achievable by treating
interference as noise.
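As a numerical sanity check, the constraints on $(r_1, r_2)$ above can be evaluated directly. The following Python sketch is illustrative only: the helper names `pos`, `symmetric_rate` and `normalized_symmetric_rate` are not from the paper. It computes the largest symmetric rate $r_1 = r_2 = r$ allowed by the constraints, using exact fractions so that the number of levels need not be an integer.

```python
from fractions import Fraction


def pos(x):
    """Positive part (x)^+."""
    return max(x, 0)


def symmetric_rate(n11, n12, n21, n22):
    """Largest r such that (r, r) satisfies every constraint listed above.

    Hypothetical helper: a bound on a*r1 + b*r2 becomes a bound on r
    after dividing by a + b.
    """
    bounds = [
        Fraction(n11),  # r1 <= n11
        Fraction(n22),  # r2 <= n22
        Fraction(pos(n11 - n12) + max(n22, n12), 2),
        Fraction(pos(n22 - n21) + max(n11, n21), 2),
        Fraction(max(n21, pos(n11 - n12)) + max(n12, pos(n22 - n21)), 2),
        Fraction(max(n11, n21) + pos(n11 - n12) + max(n12, pos(n22 - n21)), 3),
        Fraction(max(n22, n12) + pos(n22 - n21) + max(n21, pos(n11 - n12)), 3),
    ]
    return min(bounds)


def normalized_symmetric_rate(alpha, n=24):
    """Symmetric channel n11 = n22 = n, n12 = n21 = alpha * n; returns r / n."""
    cross = Fraction(alpha) * n
    return symmetric_rate(n, cross, cross, n) / n
```

Under these assumptions, `normalized_symmetric_rate(Fraction(3, 4))` evaluates to `Fraction(5, 8)`, and the values at $\alpha = 1/3$ and $\alpha = 2/3$ are both `Fraction(2, 3)`.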

5 Examples 

It is instructive to consider a few examples of capacity-achieving schemes for the deterministic
channel. For simplicity, we restrict attention to the symmetric case, i.e. $n := n_{11} = n_{22}$ and
$n_{21} = n_{12} = n\alpha$, where $\alpha := n_{12}/n_{11}$. Most of the achievable schemes presented admit simple
interpretations in the Gaussian channel. Figure 13 depicts the sum-rate capacity of the symmetric
channel, indexed by $\alpha$.

Consider first the case $\alpha = 1/3$. One option is to use the strategy described in Section 4,
making the entire signal private information (Figure 14). In the deterministic model the signal does
not appear at the unintended receiver. This corresponds to transmitting below the noise level in the
Gaussian channel, in which case the additional noise from the interference causes a loss of only one
bit for each user. A second option is for each transmitter to use the full available power, transmitting
on the highest 2/3 of the levels (Figure 15). The lower 1/3 of the levels are unusable on the direct
link due to the presence of interference. This strategy corresponds to treating interference as noise
in the Gaussian channel. The value $\alpha = 1/3$ is representative of the entire range $\alpha \in [0, \tfrac{1}{2}]$, where
both of these strategies are optimal.
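Both options for $\alpha = 1/3$ can be checked on a toy instance. The sketch below is an illustration under stated assumptions, not the paper's construction: it takes $n = 3$ levels and one cross level, represents each input as a 3-bit integer whose most significant bit is the first bit of the binary expansion, and uses hypothetical helper names throughout.

```python
from itertools import product


def det_channel(x_direct, x_cross, n_direct, n_cross, q):
    """One receiver's output in the deterministic model (illustrative helper).

    Inputs are q-bit integers. A gain 2^n followed by truncation keeps the
    top n bits; the two contributions are added bitwise modulo 2 (XOR).
    """
    return (x_direct >> (q - n_direct)) ^ (x_cross >> (q - n_cross))


N_LEVELS, N_CROSS = 3, 1  # n = 3, alpha = 1/3


def private_only(d1, d2):
    # Data sits on the two lowest levels; the top bit, the only one that
    # reaches the other receiver, is left at zero.
    return det_channel(d1, d2, N_LEVELS, N_CROSS, N_LEVELS)


def interference_as_noise(d1, d2):
    # Data sits on the two highest levels; the lowest level is unused and
    # is discarded at the receiver because interference lands on it.
    y1 = det_channel(d1 << 1, d2 << 1, N_LEVELS, N_CROSS, N_LEVELS)
    return y1 >> 1


# Every pair of 2-bit messages is decoded correctly at receiver 1.
ok = all(
    private_only(d1, d2) == d1 and interference_as_noise(d1, d2) == d1
    for d1, d2 in product(range(4), repeat=2)
)
```

In both cases receiver 1 recovers two bits per channel use, i.e. $2n/3$ for $n = 3$.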

For $\alpha = 2/3$ there are again a few options. One possibility is to use the capacity achieving
scheme of Section 4 with the lowest 1/3 of the levels consisting of private information, and the
remaining 2/3 of the levels as common information (see Figure 16). The rate achieved is
$r_1 = r_2 = 2n/3$ bits per channel use per user. Alternatively, imagine continuously varying $\alpha$ from
the value $\alpha = 1/3$ to $\alpha = 2/3$, while using the scheme of treating interference as noise (Figure 15).
The used power range will shrink to the range between $2n/3$ and $n$. However, a gap appears, and the
range of levels between 1 and $n/3$ can be used as well (Figure 17). The gap in the corresponding
Gaussian setting arises from the structure of the interference: the interference contains information,
and can be decoded. After decoding the interference it can be subtracted off, and additional information
can be transmitted. This phenomenon is the reason why treating interference as noise is no longer
optimal beyond $\alpha = 1/2$.

Figure 14: $\alpha = 1/3$. Two-thirds of the signal is private information, with no common information.
This scheme corresponds to transmitting below the noise level.

Figure 15: $\alpha = 1/3$. The top third of the levels are common information, and the middle third are
private information. This scheme corresponds to treating interference as noise.

The case $\alpha = 3/4$ is different from the previous examples: here coding is necessary. The
random code of Section 4 has the lowest 1/4 of the levels containing private information and the
highest 3/4 of the levels containing common information (Figure 18). The symmetric rate achieved
is $5n/8$ bits per channel use per user. As in the previous examples, using only one time-slot is
possible, but for a > 2/3, using one time-slot requires coding over levels. The scheme in IS, 
shown in Figure 19, achieves the rate point $(3n/4, n/2)$ by repeating a symbol on two different
levels; the symmetric point $(5n/8, 5n/8)$ is achieved by time-sharing.

Appendix: Proof of Deterministic Approximation Theorem 

In this appendix we prove Theorem 1, which states that the capacity region of the 2-user Gaussian
interference channel is within 42 bits per user of that of the deterministic interference channel. More
specifically, for each choice of channel parameters in the Gaussian channel, the corresponding
deterministic channel has approximately the same capacity region. The focus is not on optimizing the
size of the gap; several of the estimates are weakened in favor of a simpler argument. Rather, the
significance is that the gap is constant, independent of the channel gains. Moreover, the proof uses
no knowledge of the Gaussian channel. Thus, the approach used here, along with the deterministic
capacity region from Section 4, gives an alternative derivation of the constant gap capacity result of
Etkin, Tse, and Wang [8].

Figure 16: $\alpha = 2/3$. One-third of the signal is private information, and two-thirds is common
information, but the common rate equals the private rate: $r^c = r^p = n/3$.

Figure 17: $\alpha = 2/3$. As $\alpha$ is increased from 1/3 to 2/3, a gap appears in the bottom 1/3 of the
levels. This gap can be used to transmit private information.

Figure 18: $\alpha = 3/4$. This scheme is essentially the same as in Figure 16. One-quarter of the signal is
private information and three-quarters is common information. The common rate is $r^c = 3n/8$.

Figure 19: $\alpha = 3/4$. Coding over levels is performed by repeating the vector of bits $b_1$.

We first prove Theorem 2, which is the same as Theorem 1 but for the real Gaussian interference
channel, where the inputs, channel gains, and noise are real-valued. The complex-valued case is
discussed afterwards. The main ingredients used in the proof of Theorem 1 for the complex-valued
channel are the same as those introduced in the proof for the real-valued channel.

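Theorem 2. The capacity of the real-valued 2-user Gaussian interference channel with signal and
interference to noise ratios $\mathsf{SNR}_1, \mathsf{SNR}_2, \mathsf{INR}_1, \mathsf{INR}_2$ is within 18.6 bits per user of the capacity of
a deterministic interference channel with gains $2^{n_{11}} := 2^{\lfloor \frac{1}{2} \log \mathsf{SNR}_1 \rfloor}$,
$2^{n_{12}} := 2^{\lfloor \frac{1}{2} \log \mathsf{INR}_1 \rfloor}$, $2^{n_{21}} := 2^{\lfloor \frac{1}{2} \log \mathsf{INR}_2 \rfloor}$, and
$2^{n_{22}} := 2^{\lfloor \frac{1}{2} \log \mathsf{SNR}_2 \rfloor}$.

The factor of $\frac{1}{2}$ in front of the logarithm is due to the channel being real-valued.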
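The mapping from a signal- or interference-to-noise ratio to a number of levels in Theorem 2 is a one-line computation. The sketch below assumes base-2 logarithms and a ratio of at least 1; the function name `det_level` is hypothetical.

```python
import math


def det_level(ratio):
    """Number of levels n = floor((1/2) * log2(ratio)) for the real-valued channel.

    `ratio` is an SNR or INR in linear scale (assumed >= 1 here).
    """
    return math.floor(0.5 * math.log2(ratio))


# Example: SNR = 1000 (30 dB) and INR = 100 (20 dB).
n_direct = det_level(1000)
n_cross = det_level(100)
```

With these example values the direct link has 4 levels and the cross link has 3.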
Recall that the real-valued Gaussian interference channel is given by

$$
\begin{aligned}
y_1 &= h_{11} x_1 + h_{12} x_2 + z_1 \\
y_2 &= h_{21} x_1 + h_{22} x_2 + z_2
\end{aligned}
\tag{19}
$$

where $z_i \sim \mathcal{N}(0, 1)$, $h_{ij} \in \mathbb{R}$, and the input signals $x_1, x_2$ satisfy an average power constraint

$$ \frac{1}{N} \sum_{k=1}^{N} x_{i,k}^2 \le P_i , \qquad i = 1, 2 . $$

By scaling the channel gains, we may assume without loss of generality that the average power
constraints of the Gaussian channel are equal to 1, i.e. $P_1 = P_2 = 1$.
The corresponding deterministic channel, introduced in Section 3, is

$$ y_1 = \lfloor 2^{n_{11}} x_1 \rfloor \oplus \lfloor 2^{n_{12}} x_2 \rfloor \tag{20} $$

where $n_{ij} = \lfloor \log |h_{ij}| \rfloor$ and $x_i$, $i = 1, 2$ are real numbers, $0 \le x_i \le 1$. Addition is modulo 2 in
each position in the binary expansion.

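The proof of Theorem 2 requires two directions, namely

$$ \mathcal{C}_{\text{Gaussian}} \subseteq \mathcal{C}_{\text{det}} + \text{constant} $$

and

$$ \mathcal{C}_{\text{det}} \subseteq \mathcal{C}_{\text{Gaussian}} + \text{constant} . $$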

Each direction will be completed in a sequence of steps, each step comparing the capacity region 
of a new channel to that of the previous step. The first and last channels will be the Gaussian and 
deterministic channels under our consideration. 

A1. $\mathcal{C}_{\text{det}} \subseteq \mathcal{C}_{\text{Gaussian}} + (5, 5)$

We now show that the capacity achieving input of the deterministic channel (20) can be transferred
over to the Gaussian channel (19) with a loss of at most 5 bits per user. This specifies an achievable
region for the Gaussian channel. As mentioned above, the argument is based on comparing mutual
information in a sequence of steps.

The first step shows that the capacity region does not decrease if the modulo 2 addition of
the deterministic channel is replaced by real addition; Step 2 shows that the capacity region of
the deterministic channel (20) is essentially unchanged (up to $\log 3$ bits per user) if the gain
$2^{n_{ij}}$ is replaced by a real-valued $h_{ij}$ with $n_{ij} = \lfloor \log |h_{ij}| \rfloor$; Step 3 adds Gaussian noise;
Step 4 removes the truncation of received signals at the noise level.

The following easy lemma bounds the effect of a change to the channel output when the original 
output can be restored using a small amount of side information, and will be used several times. 

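Lemma 5. Fix a block-length $N$. If the signal $y^N$ is determined by the pair $(\tilde{y}^N, s^N)$, then

$$ I(x^N; \tilde{y}^N) \ge I(x^N; y^N) - H(s^N) . $$

Proof. The assumption that $y^N$ is determined by $(\tilde{y}^N, s^N)$ implies that

$$ H(x^N \mid \tilde{y}^N, s^N) \le H(x^N \mid y^N) . $$

This inequality together with the chain rule gives

$$
\begin{aligned}
I(x^N; \tilde{y}^N) &= H(x^N) - H(x^N \mid \tilde{y}^N) \\
&\ge H(x^N) - H(x^N, s^N \mid \tilde{y}^N) \\
&\ge H(x^N) - H(s^N) - H(x^N \mid y^N) \\
&= I(x^N; y^N) - H(s^N) .
\end{aligned}
$$

This proves the lemma. □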
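The side-information bound can be seen in a single-use toy case ($N = 1$). The sketch below uses illustrative choices not taken from the paper: $x$ is uniform on $\{0, \ldots, 7\}$, the original output is $y = x$, the modified output $\tilde{y}$ drops the least significant bit, and the side information $s$ is that dropped bit, so $y$ is determined by $(\tilde{y}, s)$. It checks $I(x; \tilde{y}) \ge I(x; y) - H(s)$.

```python
import math
from collections import Counter


def entropy(samples):
    """Entropy in bits of a list of equally likely outcomes."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def mutual_information(pairs):
    """I(A; B) in bits for a list of equally likely (a, b) outcomes."""
    a_vals, b_vals = zip(*pairs)
    return entropy(a_vals) + entropy(b_vals) - entropy(pairs)


xs = list(range(8))                     # x uniform on {0, ..., 7}
y = [x for x in xs]                     # original output
y_tilde = [x >> 1 for x in xs]          # modified output: least significant bit removed
s = [x & 1 for x in xs]                 # side information: the removed bit

i_y = mutual_information(list(zip(xs, y)))
i_y_tilde = mutual_information(list(zip(xs, y_tilde)))
h_s = entropy(s)
```

Here the bound holds with equality: 3 bits for the original output, 2 bits for the modified output, and 1 bit of side information.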

Step 1: Real addition (lose zero bits). For simplicity, only the output yi is discussed. The corre- 
sponding statements for y2 follow similarly. 
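We may write the inputs as

$$ x_i = \sum_{k=1}^{\infty} x_i(k) 2^{-k} , \qquad x_i(k) \in \{0, 1\} . \tag{21} $$

In the deterministic channel (20), we have

$$ \Big\lfloor 2^{n_{11}} \sum_{k=1}^{\infty} x_1(k) 2^{-k} \Big\rfloor = \sum_{k=1}^{n_{11}} 2^{n_{11}-k} x_1(k) $$

and

$$ \Big\lfloor 2^{n_{12}} \sum_{k=1}^{\infty} x_2(k) 2^{-k} \Big\rfloor = \sum_{k=1}^{n_{12}} 2^{n_{12}-k} x_2(k) . $$

Thus, the common signal from user 2 is

$$ x_{2c} = \{x_2(1), \ldots, x_2(n_{12})\} . $$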

Step 1 replaces the modulo 2 addition of the deterministic channel with real addition. Using the
two previous equations, we define (the output at receiver 1 of) Channel 1 as

$$ y_1 = \sum_{k=1}^{n_{11}} 2^{n_{11}-k} x_1(k) + \sum_{k=1}^{n_{12}} 2^{n_{12}-k} x_2(k) . $$

We claim that the capacity region of this new channel contains that of the original deterministic
channel. Any rate point within the region (17) given by Lemma 3 is achievable for Channel 1:

$$
\begin{aligned}
r_1^p + r_1^c + r_2^c &\le I(x_{1c}, x_{1p}, x_{2c}; y_1) = H(y_1) \\
r_1^p + r_1^c &\le I(x_{1c}, x_{1p}; y_1 \mid x_{2c}) = H(x_1) \\
r_2^c &\le I(x_{2c}; y_1 \mid x_{1c}, x_{1p}) = H(x_{2c}) \\
r_1^p + r_2^c &\le I(x_{1p}, x_{2c}; y_1 \mid x_{1c}) = H(y_1 \mid x_{1c}) \\
r_1^p &\le I(x_{1p}; y_1 \mid x_{2c}, x_{1c}) = H(x_{1p}) \\
r_2^p + r_2^c + r_1^c &\le I(x_{2c}, x_{2p}, x_{1c}; y_2) = H(y_2) \\
r_2^p + r_2^c &\le I(x_{2p}, x_{2c}; y_2 \mid x_{1c}) = H(x_2) \\
r_1^c &\le I(x_{1c}; y_2 \mid x_{2c}, x_{2p}) = H(x_{1c}) \\
r_2^p + r_1^c &\le I(x_{2p}, x_{1c}; y_2 \mid x_{2c}) = H(y_2 \mid x_{2c}) \\
r_2^p &\le I(x_{2p}; y_2 \mid x_{1c}, x_{2c}) = H(x_{2p}) .
\end{aligned}
\tag{22}
$$

Thus, it suffices to show that each of the mutual information constraints is made looser when using
the (optimal) uniform input distribution of the deterministic channel. Note that only the first, fourth,
sixth, and ninth constraints are affected by the change to real addition.

Now, in the deterministic channel (20), the output $y_1$ is uniformly distributed; equivalently,
each bit in the binary expansion of $y_1$ that is random is independent of the other bits and has equal
probability of being zero or one. The distribution of these bits in the binary expansion of $y_1$ does
not change in passing to real addition, because each bit is the sum modulo two of a carry bit and a
fresh random bit. It follows that the entropy $H(y_1)$ does not decrease. The entropies $H(y_1 \mid x_{1c})$ and
$H(y_2 \mid x_{2c})$ behave similarly.
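The claim that real addition does not reduce the output entropy under uniform inputs can be verified by enumeration in a small case. The sketch below picks $n_{11} = 3$ and $n_{12} = 2$ (illustrative values, not from the paper) and compares the entropy of the modulo-2 sum with that of the real sum.

```python
import math
from collections import Counter
from itertools import product


def entropy_bits(values):
    """Entropy in bits of a list of equally likely outcomes."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


n11, n12 = 3, 2
# All equally likely pairs (received direct signal, received cross signal).
pairs = list(product(range(2 ** n11), range(2 ** n12)))

h_xor = entropy_bits([a ^ b for a, b in pairs])  # deterministic channel: modulo-2 addition
h_sum = entropy_bits([a + b for a, b in pairs])  # Channel 1: real addition
```

In this instance the modulo-2 output is uniform on 8 values (3 bits), while the real sum spreads over 11 values and has strictly larger entropy.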

Step 2: Real-valued gains (lose log 3 bits). In this step we compare the achievable rate under a
uniform input distribution of a channel with real-valued gains to the achievable rate in Step 1, losing
at most $\log 3$ bits per user. The result is an achievable region that is within $\log 3$ bits per user of the
capacity region of the original deterministic channel.

To allow real-valued gains, we first allow negative cross gains. It is sufficient to consider only
the case of cross gains, rather than any of the gains, being negative, since each transmitter can
negate its input to ensure a positive signal on the direct link. Viewing each input as coming from
a contiguous subset of integers in the real line, it is clear that the entropy constraints in (22) are
invariant to negating a cross gain when the distribution is uniform.

Next, replace $2^{n_{ij}}$ with the gain $h_{ij}$ having binary expansion

$$ h_{ij} = \operatorname{sign}(h_{ij}) \sum_{k=-n_{ij}}^{\infty} 2^{-k} h_{ij}(k) . $$

Accordingly, Channel 2 is given by

$$
y_1 = \Big\lfloor \Big( \sum_{k=-n_{11}}^{\infty} 2^{-k} h_{11}(k) \Big) \Big( \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \Big) \Big\rfloor
+ \operatorname{sign}(h_{12}) \Big\lfloor \Big( \sum_{k=-n_{12}}^{\infty} 2^{-k} h_{12}(k) \Big) \Big( \sum_{k=1}^{\infty} 2^{-k} x_2(k) \Big) \Big\rfloor
\tag{23}
$$

and analogously for $y_2$. We continue by comparing the mutual information constraints in (22),
noting that any rate in the intersection of the MACs at each receiver is achievable by coding for
the MACs. To begin, we may view the first term in (23) as starting with the random variable
$2^{n_{11}} \sum_{k=1}^{n_{11}} 2^{-k} x_1(k)$, which is uniformly distributed on $\{0, \ldots, 2^{n_{11}} - 1\}$, scaled by
$|h_{11}| / 2^{n_{11}} \ge 1$, and retaining the integer part $\lfloor \cdot \rfloor$. Upon scaling, any two points in the
support are at least distance 1 apart, so the integer parts are at least distance 1 apart as well. Thus,
the first term in (23) is uniformly distributed with support a subset of the integers having cardinality
$2^{n_{11}}$; the support now has gaps, and is no longer the set of integers between 0 and $2^{n_{11}} - 1$
(see Figure 20).

Figure 20: Making the gains real-valued creates gaps in the support without changing its cardinality.
In this example $n = 3$ and $h = 1.4(2^3) = 11.2$.

The second term in (23) is similar, but the argument must be modified to account for the part
of the signal below the noise level. We have

$$
\begin{aligned}
\Big\lfloor \Big( \sum_{k=-n_{12}}^{\infty} 2^{-k} h_{12}(k) \Big) \Big( \sum_{k=1}^{\infty} 2^{-k} x_2(k) \Big) \Big\rfloor
&= \Big\lfloor |h_{12}| \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) + |h_{12}| \sum_{k=n_{12}+1}^{\infty} 2^{-k} x_2(k) \Big\rfloor \\
&= \lfloor A_1 + A_2 \rfloor .
\end{aligned}
\tag{24}
$$

The argument for the first term in (23) applies to the sum $A_1$ in (24), giving that $A_1$ is distributed
uniformly with spacing $|h_{12}| / 2^{n_{12}} \ge 1$ and support set having cardinality $2^{n_{12}}$. Now, $A_2$ is bounded
as $0 \le A_2 < 2$, since $|h_{12}| < 2^{n_{12}+1}$. Hence, defining

$$ s = \lfloor A_1 + A_2 \rfloor - \lfloor A_1 \rfloor , \tag{25} $$

we see that $s$ can take on values 0, 1, 2, giving

$$ H(s) \le \log 3 . $$

Neglecting $A_2$, let the modified output be

$$
\tilde{y}_1 = \Big\lfloor \Big( \sum_{k=-n_{11}}^{\infty} 2^{-k} h_{11}(k) \Big) \Big( \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \Big) \Big\rfloor
+ \operatorname{sign}(h_{12}) \Big\lfloor \Big( \sum_{k=-n_{12}}^{\infty} 2^{-k} h_{12}(k) \Big) \Big( \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) \Big) \Big\rfloor .
\tag{26}
$$

Since $y_1$ can be recovered from the pair $(\tilde{y}_1, s)$, Lemma 5 shows that

$$ I(x_1; \tilde{y}_1) \ge I(x_1; y_1) - \log 3 . $$

The argument is completed by using the fact that

$$
H\Big( \Big\lfloor |h_{11}| \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \Big\rfloor + \Big\lfloor |h_{12}| \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) \Big\rfloor \Big)
\ge H\Big( \sum_{k=1}^{n_{11}} 2^{n_{11}-k} x_1(k) + \sum_{k=1}^{n_{12}} 2^{n_{12}-k} x_2(k) \Big) .
$$

This is seen to be true by directly comparing the distributions of the two random variables within
the entropies. Counting the number of pairs of integers that sum to each integer, we see that the
distribution on the left-hand side can be achieved by shifting probability mass from more likely to
less likely values.

The argument applies to all the mutual information constraints of (22). Step 2 incurs a loss of
$\log 3 < 1.6$ bits.

Step 3: Additive Gaussian noise (lose 1.5 bits). Let Channel 3 be obtained from Channel 2 by
adding Gaussian noise $z_i \sim \mathcal{N}(0, 1)$ to output $i$, where the outputs of Channel 2 are given by (26),

$$
y_1 = \Big\lfloor |h_{11}| \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \Big\rfloor + \operatorname{sign}(h_{12}) \Big\lfloor |h_{12}| \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) \Big\rfloor
\tag{27}
$$

and similarly for $y_2$.

Define the random variable $s = [z_1]$, where $[\,\cdot\,]$ is the nearest integer function. Observe that it is
possible to recover $y_1^N$ from the pair $(y_1^N + z_1^N, s^N)$. Lemma 5 gives that

$$ \frac{1}{N} I(x_1^N; y_1^N + z_1^N) \ge \frac{1}{N} I(x_1^N; y_1^N) - H(s) . $$

It remains only to derive a bound on the entropy of $s$,

$$
\begin{aligned}
H(s) &= -\sum_{k=-\infty}^{\infty} P(s = k) \log P(s = k) \\
&= -2 \sum_{k=1}^{\infty} P(s = k) \log P(s = k) - P(s = 0) \log P(s = 0) \\
&\le 1.5 .
\end{aligned}
$$

Step 4: Remove truncation at noise level (lose log 3 bits). Let Channel 4 be the Gaussian channel

$$
\begin{aligned}
y_1 &= h_{11} x_1 + h_{12} x_2 + z_1 \\
y_2 &= h_{21} x_1 + h_{22} x_2 + z_2 .
\end{aligned}
$$

The difference between Channels 3 and 4 is that signals received below the noise level are no longer
truncated at the receivers. The output at receiver 1 is

$$ y_1 = h_{11} x_1 + h_{12} x_2 + z_1 = \tilde{y}_1 + \hat{x}_1 + \operatorname{sign}(h_{12}) \hat{x}_2 , $$

where $\tilde{y}_1$ is the output at receiver 1 in Channel 3 (27) and $\hat{x}_1, \hat{x}_2$ are the magnitudes of the signals
received below the noise level at receiver 1.

The approach is similar to Step 3. Define the random variable

$$ s = [\hat{x}_1 + \operatorname{sign}(h_{12}) \hat{x}_2] \tag{28} $$

where $[\,\cdot\,]$ is the nearest integer function. Each of $\hat{x}_1, \hat{x}_2$ is bounded between 0 and 1 (since they are
below the noise level), and so the random variable $s$ can take at most 3 values. Hence the entropy
of $s$ is bounded as

$$ H(s) \le \log 3 . $$

It is possible to recover $\tilde{y}_1^N$ from the pair $(y_1^N, s^N)$. Therefore Lemma 5 gives

$$ \frac{1}{N} I(x_1^N; y_1^N) \ge \frac{1}{N} I(x_1^N; \tilde{y}_1^N) - \log 3 . $$

This completes the first direction of the proof.

Remark 2. The above proof used the form of the capacity achieving input distribution. Thus, it 
does not follow that any capacity achieving distribution for the deterministic channel can simply be 
used with an outer code in the Gaussian channel. 

Remark 3. The final achievable strategy uses only positive, peak-power constrained inputs to the 
channel, which is obviously suboptimal. 

A2. $\mathcal{C}_{\text{Gaussian}} \subseteq \mathcal{C}_{\text{det}} + (13.6, 13.6)$

Here we begin with the Gaussian channel and finish with the deterministic channel. Most of the steps
are precisely the reverse of those in the previous section. There is an important difference, however: the
inputs to the Gaussian channel satisfy the less stringent average power constraint whereas the inputs
to the deterministic channel must satisfy a peak power constraint. An extra step in the argument
accounts for this difference.

Step 1 removes the part of the input signals exceeding the peak power constraint; Step 2 truncates
the signals at the noise level and removes the noise; Step 2' derives a single-letter expression
for the capacity region of the channel in Step 2 and shows the near-optimality of uniformly distributed
inputs; Step 3 restricts the inputs and channel gains to positive numbers; Step 4 makes
addition modulo 2; Step 5 quantizes the channel gains to the form $2^{n_{ij}}$.

Denote by Channel 0 the original Gaussian interference channel,

$$
\begin{aligned}
y_1 &= h_{11} x_1 + h_{12} x_2 + z_1 \\
y_2 &= h_{21} x_1 + h_{22} x_2 + z_2 .
\end{aligned}
\tag{29}
$$

Recall that we assumed a unit average power constraint

$$ \frac{1}{N} \sum_{k=1}^{N} x_{i,k}^2 \le 1 . \tag{30} $$

Step 1: Peak power constraint instead of average power constraint (lose 4 bits). The input-output
relationship of Channel 1 is the same as Channel 0 (29):

$$ y_i = h_{i1} x_1 + h_{i2} x_2 + z_i . \tag{31} $$

The difference is that the inputs to Channel 1 satisfy a peak power constraint instead of an average
power constraint:

$$ |x_i| \le 1 . $$

Writing the binary expansion of $x_i$,

$$ x_i = \operatorname{sign}(x_i) \sum_{k=-\infty}^{\infty} x_i(k) 2^{-k} , $$

we see that in Channel 1, $x_i(k) = 0$ for $k \le 0$.

Let $x_i$ be an input to Channel 0, satisfying the average power constraint (30). Let the part of the
input that exceeds the peak power constraint be

$$ \hat{x}_i = \operatorname{sign}(x_i) \sum_{k=-\infty}^{0} x_i(k) 2^{-k} $$

and let

$$ \bar{x}_i = x_i - \hat{x}_i = \operatorname{sign}(x_i) \sum_{k=1}^{\infty} x_i(k) 2^{-k} $$

be the remaining signal. The signal $\bar{x}_i$ is defined so as to satisfy the peak power constraint. Finally,
denote by $\bar{y}_i$ the output at receiver $i$ when the inputs are truncated to the peak power constraint,

$$ \bar{y}_i = h_{i1} \bar{x}_1 + h_{i2} \bar{x}_2 + z_i , $$

and let

$$ \hat{y}_i = y_i - \bar{y}_i = h_{i1} \hat{x}_1 + h_{i2} \hat{x}_2 \tag{32} $$

be the output due to the inputs $\hat{x}_1, \hat{x}_2$.

To complete Step 1, we show that most of the mutual information $I(x_1^N; y_1^N)$ is preserved when
the inputs are truncated to the peak power constraint. First, observe that since $x_1$ and $x_2$ are
independent, $\hat{x}_1^N, \bar{x}_1^N, \bar{y}_1^N$ form a Markov chain, $\hat{x}_1^N - \bar{x}_1^N - \bar{y}_1^N$. It follows that

$$ I(\hat{x}_1^N; \bar{y}_1^N \mid \bar{x}_1^N) = 0 . $$

Hence, from the data processing inequality and the mutual information chain rule we have

$$
\begin{aligned}
I(x_1^N; y_1^N) &\le I(\bar{x}_1^N, \hat{x}_1^N; \bar{y}_1^N, \hat{y}_1^N) \\
&= I(\bar{x}_1^N, \hat{x}_1^N; \bar{y}_1^N) + I(\bar{x}_1^N, \hat{x}_1^N; \hat{y}_1^N \mid \bar{y}_1^N) \\
&\le I(\bar{x}_1^N; \bar{y}_1^N) + I(\hat{x}_1^N; \bar{y}_1^N \mid \bar{x}_1^N) + H(\hat{y}_1^N) \\
&= I(\bar{x}_1^N; \bar{y}_1^N) + H(\hat{y}_1^N) \\
&\le I(\bar{x}_1^N; \bar{y}_1^N) + H(\hat{x}_1^N) + H(\hat{x}_2^N) .
\end{aligned}
\tag{33}
$$

The last inequality is a consequence of the fact that $\hat{x}_1, \hat{x}_2$ determine $\hat{y}_1$. It remains only to bound
each of the entropy terms in (33).

Lemma 6. The following bound on the entropy holds:

$$ H(\hat{x}_i^N) \le 2N . \tag{34} $$

Proof. The proof is based on the requirement that the part of $x_i^N$ exceeding the peak power
constraint, $\hat{x}_i^N$, itself must satisfy the average power constraint. Note that the entropy $H(\hat{x}_i^N)$ does
not depend on the channel gains at all. The part of the signal satisfying the peak power constraint,
$\bar{x}_i$, absorbs all the benefit from increasing the signal to noise ratio, as less significant bits from $\bar{x}_i$
appear above the noise level at the receiver.

Two approaches are possible. The simpler approach is to observe that any scheme in the
point-to-point deterministic channel with average power constraint can be used without modification in the
Gaussian channel with power constraint $P = 1$, with a loss of at most 1.5 bits due to noise, by the
argument in Step 3 of the previous subsection. The result then follows from the fact that the capacity
of the point-to-point Gaussian channel with average power constraint $P = 1$ is $\frac{1}{2} \log(1 + 1) = \frac{1}{2}$.
Thus,

$$ H(\hat{x}_i^N) \le 2N . $$

Alternatively, one may explicitly bound the number of possible values for $\hat{x}_i^N$ using a
combinatorial argument. The first step is to notice that for each transmission at power $2^m$, it must hold
that $2^m - 1$ other time slots are silent. By writing a recursion in $m$ and $N$ on the number of
possible signals of length $N$ with peak power between $2^m$ and $2^{m+1}$, it is possible to bound the
cardinality of the support of $\hat{x}_i^N$ by $\mathrm{poly}(N) c^N$ for a constant $c$ and for all $N$, which shows that
$\limsup_{N \to \infty} \frac{1}{N} H(\hat{x}_i^N) \le \log c$. □

Plugging in the estimate (34) from the Lemma into (33) shows that at most 4 bits per user are
lost in passing to a peak power constraint.

Step 2: Truncate signals at noise level, remove fractional part of channel gains, and remove
noise (lose 2.6 bits). The truncation at the noise level is not performed by solely taking the integer
part of a real-valued signal; instead, the binary expansion of each incoming signal is truncated
appropriately, and only then do we take the integer part of each signal. In the final deterministic
channel the two procedures are equivalent, so we choose the option that is more convenient for
the proof. The key benefit of this choice of truncation is the resulting clear distinction between
common and private information, with the unintended receiver able to decode the common
information. The derivation of the single-letter expression for the deterministic channel in Section 4 can
then be applied without modification in Step 2'.

We write the peak-power constrained channel inputs as

$$ x_i = \operatorname{sign}(x_i) \sum_{k=1}^{\infty} x_i(k) 2^{-k} , \qquad x_i(k) \in \{0, 1\} . \tag{35} $$

If $\lfloor \log |h| \rfloor = n$, then we deem as being above the noise level the component of $hx$ arising from
the $n$ most significant bits in the binary expansion of $x$:

$$ h \operatorname{sign}(x) \sum_{k=1}^{n} 2^{-k} x(k) . \tag{36} $$

The magnitude of the part below the noise level can be bounded as

$$ |h| \sum_{k=n+1}^{\infty} 2^{-k} x(k) \le 2^{n+1} 2^{-n} = 2 . \tag{37} $$

Channel 2 is defined by retaining only the part of the inputs above the noise level as described in
(36), taking the integer part of the channel gains, further taking the integer part of each observed
signal, and removing the noise. More specifically, receiver 1 observes the signal

$$
\tilde{y}_1 = \Big\lfloor \operatorname{sign}(x_1) \lfloor h_{11} \rfloor \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \Big\rfloor
+ \Big\lfloor \operatorname{sign}(x_2) \lfloor h_{12} \rfloor \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) \Big\rfloor .
\tag{38}
$$



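Now, denote by $\varepsilon_1$ the difference in the outputs relative to Channel 1, ignoring the additive
Gaussian noise:

$$
\begin{aligned}
\varepsilon_1 := y_1 - \tilde{y}_1
= {} & \Big\{ h_{11} \operatorname{sign}(x_1) \sum_{k=n_{11}+1}^{\infty} 2^{-k} x_1(k)
+ (h_{11} - \lfloor h_{11} \rfloor) \operatorname{sign}(x_1) \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \\
& \quad + \operatorname{frac}\Big( \operatorname{sign}(x_1) \lfloor h_{11} \rfloor \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \Big) \Big\} \\
& + \Big\{ h_{12} \operatorname{sign}(x_2) \sum_{k=n_{12}+1}^{\infty} 2^{-k} x_2(k)
+ (h_{12} - \lfloor h_{12} \rfloor) \operatorname{sign}(x_2) \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) \\
& \quad + \operatorname{frac}\Big( \operatorname{sign}(x_2) \lfloor h_{12} \rfloor \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) \Big) \Big\} + z_1 \\
=: {} & \check{x}_1 + \check{x}_2 + z_1 ,
\end{aligned}
$$

where $\operatorname{frac}(\cdot)$ denotes the fractional part. Combining the estimate (37) and the fact that
$|(h_{ij} - \lfloor h_{ij} \rfloor) x_j| \le 1$, we have

$$ |\check{x}_i| \le 4 , \qquad i = 1, 2 . \tag{39} $$

We will later use the observation that $\check{x}_1, \check{x}_2 \mapsto \varepsilon_1$ forms a Gaussian MAC, and from (39) the
signal-to-noise ratio is at most 16 for each user.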
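We show next that

$$ \frac{1}{N} I(x_1^N; \tilde{y}_1^N) + 2.6 \ge \frac{1}{N} I(x_1^N; y_1^N) , $$

where $y_1$ is the output of Channel 1 defined in (31). Note that $\tilde{y}_1$ is independent of $z_1$. The data
processing inequality and the chain rule allow us to separate the contribution to the mutual information
$I(x_1^N; y_1^N)$ from each term $\varepsilon_1^N$, $\tilde{y}_1^N$:

$$
\begin{aligned}
I(x_1^N; y_1^N) &= I(x_1^N; \tilde{y}_1^N + \varepsilon_1^N) \\
&\le I(x_1^N; \tilde{y}_1^N, \varepsilon_1^N) \\
&= I(x_1^N; \tilde{y}_1^N) + I(x_1^N; \varepsilon_1^N \mid \tilde{y}_1^N) \\
&\le I(x_1^N; \tilde{y}_1^N) + h(\varepsilon_1^N) - h(\varepsilon_1^N \mid \tilde{y}_1^N, x_1^N, \check{x}_1^N, \check{x}_2^N) \\
&= I(x_1^N; \tilde{y}_1^N) + I(\check{x}_1^N, \check{x}_2^N; \varepsilon_1^N) \\
&\le I(x_1^N; \tilde{y}_1^N) + 2.6 N ,
\end{aligned}
$$

where the last inequality holds for sufficiently large $N$. In the last step we used the fact that
$\check{x}_1, \check{x}_2 \mapsto \varepsilon_1$ forms a Gaussian MAC with signal-to-noise ratio at most 16 for each transmitter,
so $\frac{1}{N} I(\check{x}_1^N, \check{x}_2^N; \varepsilon_1^N) \le \frac{1}{2} \log(1 + 2(16)) + \epsilon_N$ (with $\epsilon_N \to 0$). This completes Step 2.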

Step 2': Single letter expression and near optimality of uniform input distribution (lose 2 bits).

We now show that the derivation of Section 4, giving a single letter expression for the capacity region
of the deterministic channel (17), applies to the channel of Step 2. Following this, we will prove that
using uniformly distributed inputs incurs a loss of at most two bits per user relative to the optimal
input distribution.

Define

$$ x_{2c} := \operatorname{sign}(x_2) \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) , \tag{40} $$

and similarly for $x_{1c}$. This is the part of the input that causes interference at the unintended receiver.
Consider the signal that remains at receiver 1 after successfully decoding and subtracting off $x_1$.
From (38), the remaining signal is

$$
f(x_{2c}) := \lfloor \lfloor h_{12} \rfloor x_{2c} \rfloor
= \Big\lfloor \operatorname{sign}(x_2) \lfloor h_{12} \rfloor \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) \Big\rfloor .
\tag{41}
$$

The statement that $f : \operatorname{supp}(x_{2c}) \to \mathbb{Z}$ is injective is equivalent to the claim that receiver 1 can
recover $x_{2c}$ from $f(x_{2c})$. Now, viewed as a real number, the support of $x_{2c}$ has a spacing of $2^{-n_{12}}$,
and since

$$ |h_{12}| \ge 2^{n_{12}} , \tag{42} $$

the spacing of the support of $\lfloor h_{12} \rfloor x_{2c}$ is at least 1. Hence the integer part $\lfloor \cdot \rfloor$ sends two
different values of $\lfloor h_{12} \rfloor x_{2c}$ to two different integers, i.e. $f$ is injective. An analogous argument
shows that receiver 2 can recover $x_{1c}$.
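The injectivity argument can be checked numerically on a small case. The sketch below borrows the values $n = 3$ and $h = 11.2$ from the caption of Figure 20 and takes the sign to be positive; the function name `f` mirrors the map in the text, while the integer encoding of the common bits is an assumption made for illustration.

```python
import math


def f(h12, n12, bits):
    """floor(floor(|h12|) * x2c) for a common signal with the given n12 bits.

    `bits` is an integer in [0, 2^n12) encoding x2(1), ..., x2(n12),
    most significant bit first, so x2c = bits / 2^n12.
    """
    x2c = bits / 2 ** n12
    return math.floor(math.floor(abs(h12)) * x2c)


n12, h12 = 3, 11.2  # floor(log2(11.2)) = 3
images = [f(h12, n12, b) for b in range(2 ** n12)]
is_injective = len(set(images)) == len(images)
```

All eight common signals map to distinct integers, so the receiver can invert $f$ by table lookup.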

Since each receiver can recover the common portion of the interfering signal (40), the arguments
of Lemmas 2 and 3 in Section 4 apply without modification to the channel under scrutiny. Thus, the
region is given by (22).

We now show that at most two bits per user are lost relative to the capacity region when each of
the signals $x_{1c}, x_{1p}, x_{2c}, x_{2p}$ is uniformly distributed on its support. We first prove a comparable
result, with a loss of one bit, for random variables with support sets that are arithmetic progressions
of integers.

Lemma 7. Let $A, B \subset \mathbb{Z}$ be two arithmetic progressions,

$$
\begin{aligned}
A &= \{0, a, 2a, \ldots, (M_A - 1) a\} = [0, M_A - 1] \cdot a \\
B &= \{0, b, 2b, \ldots, (M_B - 1) b\} = [0, M_B - 1] \cdot b .
\end{aligned}
$$

If $X$ and $Y$ are independent and distributed uniformly on $A$ and $B$, respectively, then

$$ H(X + Y) + 1 \ge H(X^* + Y^*) \tag{43} $$

for any random variables $X^*, Y^*$ with support sets $A, B$.

Proof. Scaling the sets $A$ and $B$ by the same number does not change the relevant entropies, so we
may assume without loss of generality that $\gcd(a, b) = 1$. We first estimate the cardinality of the
sumset $A + B = \{a' + b' : a' \in A, b' \in B\}$. Note that

$$ A + B \subseteq \{0, \ldots, a(M_A - 1) + b(M_B - 1)\} , $$

from which it follows that

$$ |A + B| \le a M_A + b M_B . \tag{44} $$

Since $\operatorname{supp}(X^* + Y^*) \subseteq A + B$, we therefore have the estimate

$$ H(X^* + Y^*) \le \log(a M_A + b M_B) . \tag{45} $$

Next we calculate the maximum probability mass in the distribution of $X + Y$,

$$ p := \max_{x \in A + B} P(X + Y = x) . \tag{46} $$

For each $k$ with $0 \le k \le M_B - 1$ let

$$ S_k := A + kb = [0, M_A - 1] \cdot a + kb . $$

For $k$ outside the interval $[0, M_B - 1]$, $S_k$ is defined to be empty. A typical element of $S_k \cap S_{k'}$ with
$k' < k$ can be written as

$$ qa + kb = q'a + k'b , $$

for some $0 \le q \le M_A - 1$ and $0 \le q' \le M_A - 1$. Rearranging, we have

$$ (k - k') b = (q' - q) a , $$

which by the assumption $\gcd(a, b) = 1$ implies

$$ a \mid (k - k') . $$

Thus

$$ S_k \cap S_{k'} \ne \emptyset \quad \text{implies} \quad k \equiv k' \bmod a . \tag{47} $$
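Letting $\tilde{A}$ and $\tilde{B}$ be shifts of $A$ and $B$ so that a median point lies at the origin, the maximum in (46)
occurs at $x = 0$, and it can be seen from the condition (47) that

$$ |\{x, y : x + y = 0, x \in \tilde{A}, y \in \tilde{B}\}| \le \min\Big( \frac{M_A}{b}, \frac{M_B}{a} \Big) . $$

Since for each $x \in \tilde{A}$, $y \in \tilde{B}$, $P(X = x) = 1/M_A$ and $P(Y = y) = 1/M_B$, and $X$ and $Y$ are
independent,

$$
\begin{aligned}
-\log p &= -\log \sum_{\substack{x \in \tilde{A},\, y \in \tilde{B} \\ x + y = 0}} P(X = x, Y = y) \\
&= -\log \frac{|\{x, y : x + y = 0, x \in \tilde{A}, y \in \tilde{B}\}|}{M_A M_B} \\
&\ge \log \frac{M_A M_B}{\min\big( \frac{M_A}{b}, \frac{M_B}{a} \big)} \\
&= \max(\log(a M_A), \log(b M_B)) .
\end{aligned}
$$

Hence, from equation (45),

$$
\begin{aligned}
H(X + Y) &= -\sum_{x \in A + B} p(x) \log p(x) \\
&\ge -\sum_{x \in A + B} p(x) \log p \\
&\ge \max(\log(a M_A), \log(b M_B)) \\
&\ge \log(a M_A + b M_B) - 1 \\
&\ge H(X^* + Y^*) - 1 .
\end{aligned}
\tag{48}
$$

This proves the lemma. □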
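The one-bit gap of the lemma can be observed on a small instance by enumeration. The sketch below uses illustrative parameters $a = 2$, $b = 3$, $M_A = M_B = 4$ (chosen here, not taken from the paper) and relies on the fact that any $X^* + Y^*$ supported on $A + B$ has entropy at most $\log |A + B|$; the helper name is hypothetical.

```python
import math
from collections import Counter
from itertools import product


def sum_entropy_uniform(a, m_a, b, m_b):
    """Return (H(X + Y) in bits, |A + B|) for X, Y independent and uniform
    on A = {0, a, ..., (m_a - 1)a} and B = {0, b, ..., (m_b - 1)b}."""
    sums = Counter(i * a + j * b for i, j in product(range(m_a), range(m_b)))
    total = m_a * m_b
    h = -sum(c / total * math.log2(c / total) for c in sums.values())
    return h, len(sums)


h_uniform, support_size = sum_entropy_uniform(2, 4, 3, 4)

# log|A + B| upper-bounds H(X* + Y*) for every choice of X*, Y* on A, B.
upper_bound_any_input = math.log2(support_size)
```

For these parameters the uniform inputs give 3.75 bits, while no input distribution can exceed $\log_2 14 \approx 3.81$ bits, comfortably within the one-bit gap.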

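It is not difficult to extend the proof of the Lemma to show the near optimality of uniformly
distributed inputs for the channel defined by (38). Let

$$ U := \lfloor h_{11} \rfloor \sum_{k=1}^{n_{11}} 2^{-k} x_1(k) \tag{49} $$

and

$$ V := \lfloor h_{12} \rfloor \sum_{k=1}^{n_{12}} 2^{-k} x_2(k) , \tag{50} $$

so that

$$ y = \lfloor U \rfloor + \lfloor V \rfloor . $$

Also, let

$$
\begin{aligned}
A &:= \operatorname{supp}(U) = \{0, \lfloor h_{11} \rfloor, \ldots, \lfloor h_{11} \rfloor (2^{n_{11}} - 1)\} \cdot 2^{-n_{11}} , \\
B &:= \operatorname{supp}(V) = \{0, \lfloor h_{12} \rfloor, \ldots, \lfloor h_{12} \rfloor (2^{n_{12}} - 1)\} \cdot 2^{-n_{12}} .
\end{aligned}
$$

Assume without loss of generality (by symmetry of the definitions of $U$ and $V$) that $n_{11} \ge n_{12}$.
We will work with scaled, integer-valued versions of $U$ and $V$: let

$$ \Delta := 2^{n_{11}} $$

and

$$ \tilde{U} := \Delta U , \qquad \tilde{V} := \Delta V . $$

Let $M_A = \Delta$ and $M_B = 2^{n_{12}}$. The support sets are

$$ \tilde{A} = \{0, 1, \ldots, (M_A - 1)\} \cdot \lfloor h_{11} \rfloor $$

and

$$ \tilde{B} = \{0, 1, \ldots, (M_B - 1)\} \cdot \Delta (\lfloor h_{12} \rfloor 2^{-n_{12}}) . $$

Correspondingly, the integer part of a number $t$ is replaced by quantization to the greatest multiple
of $\Delta$ less than or equal to $t$:

$$ Q(t) := \Delta \Big\lfloor \frac{t}{\Delta} \Big\rfloor . $$

In the notation of Lemma 7, the spacings in the sets $\tilde{A}$ and $\tilde{B}$ are, respectively, $a = \lfloor h_{11} \rfloor$ and
$b = \Delta (\lfloor h_{12} \rfloor 2^{-n_{12}})$. Proving the equivalent of Lemma 7 for $Q(\tilde{U}) + Q(\tilde{V})$ will imply the same
result for $y = \lfloor U \rfloor + \lfloor V \rfloor$ by the scale-invariance of discrete entropy.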
With this notation, we have analogously to (44) that

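The next step is to compute a bound on the maximum probability mass in $Q(\tilde{U}) + Q(\tilde{V})$,

$$p^* := \max_x P\left(Q(\tilde{U}) + Q(\tilde{V}) = x\right).$$

For any $x$, we have

$$\begin{aligned}
\{u \in \operatorname{supp}(\tilde{U}),\ v \in \operatorname{supp}(\tilde{V}) : Q(u) + Q(v) = x\}
&\subseteq \{u \in \operatorname{supp}(\tilde{U}),\ v \in \operatorname{supp}(\tilde{V}) : u + v \in [x, x + 2\Delta)\} \\
&= \bigcup_{x^* \in [x, x + 2\Delta)} \{u \in \operatorname{supp}(\tilde{U}),\ v \in \operatorname{supp}(\tilde{V}) : u + v = x^*\}.
\end{aligned}$$

Thus

$$\begin{aligned}
p^* &\le \max_x \sum_{x^* \in [x, x + 2\Delta)} P(\tilde{U} + \tilde{V} = x^*) \\
&\le 2 \Delta p,
\end{aligned} \tag{52}$$

where $p$ is defined in (46). Combining (51) and (52), the desired result now follows exactly as in
equation (48) of Lemma 7, giving that

$$H(U + V) \ge H(U^* + V^*) - 2.$$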

The near optimality of the uniform distribution applies to each entropy constraint in (22), and
thus each user loses at most 2 bits as claimed.
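The bound $p^* \le 2\Delta p$ on the largest probability mass after quantization can be verified by brute force on small examples. The following is a sketch under stated assumptions: the inputs are uniform on integer-valued supports with spacings `a` and `b`, and the function names and parameter values are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def max_mass(values_u, values_v, transform=lambda t: t):
    """Largest probability mass of transform(u) + transform(v) for independent uniform u and v."""
    counts = Counter(transform(u) + transform(v) for u in values_u for v in values_v)
    return Fraction(max(counts.values()), len(values_u) * len(values_v))

def mass_bound_holds(a, M_A, b, M_B, delta):
    """Check p_star <= 2 * delta * p for supports a*{0..M_A-1} and b*{0..M_B-1}."""
    support_u = [a * i for i in range(M_A)]
    support_v = [b * j for j in range(M_B)]
    p = max_mass(support_u, support_v)                       # largest mass of the unquantized sum
    p_star = max_mass(support_u, support_v,
                      lambda t: delta * (t // delta))        # largest mass after quantizing each term
    return p_star <= 2 * delta * p

# Hypothetical example: delta = 8, M_A = 8, a = 5, M_B = 4, b = 8 * 3 / 4 = 6.
print(mass_bound_holds(a=5, M_A=8, b=6, M_B=4, delta=8))
```

Exact fractions are used for the probabilities so that the comparison is not affected by rounding.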

Step 3: Positive inputs and channel gains (lose 2 bits). From Step 2', the uniform distribution is
nearly optimal for Channel 2. Viewing the inputs as coming from a constellation in the real line, it
is not hard to see that negating a cross gain does not change any of the output statistics, and therefore
preserves the mutual information. Similarly, each of the output entropies in (22) is reduced by at
most 2 bits if the inputs are restricted to be positive.

Step 4: Addition over $\mathbb{F}_2$ (lose 2 bits). Consider the binary expansion of the output. In switching
to modulo 2 addition, every output bit that has some entropy when using real addition is uniformly
random, except possibly the two most significant bits that arise due to carry-overs. Thus, at most
two bits are lost in each of the entropy constraints of (22).
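The difference between real addition and bitwise addition over $\mathbb{F}_2$ is exactly the carry term. The sketch below illustrates this for plain nonnegative integers (not for the channel outputs themselves, which is a simplifying assumption of the example); the function name is hypothetical.

```python
def real_vs_f2_addition(n_bits):
    """Check a + b == (a XOR b) + 2 * (a AND b) for all pairs of n_bits-bit integers."""
    limit = 1 << n_bits
    for a in range(limit):
        for b in range(limit):
            xor_sum = a ^ b          # bitwise addition over F2 (no carries)
            carries = (a & b) << 1   # carry bits, shifted to the next position
            assert a + b == xor_sum + carries
            assert a + b < 2 * limit  # the real sum needs at most n_bits + 1 bits
    return True

real_vs_f2_addition(4)
```

Dropping the carry term turns real addition into modulo 2 addition, which is the modification made in this step.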

Step 5: Channel gains of the form $2^n$ (lose zero bits). Channel 5 is the deterministic channel
(20). The optimal input distribution is uniform and the mutual information is unchanged when the
gains are quantized to the nearest power of 2. In fact, the capacities of the channel in Step 4 and the
channel of Step 5 are identical.

Al. Complex Gaussian IC 

The proof of Theorem 1 in the generality of complex-valued gains and signals is very similar to the
proof of Theorem 2 for the real-valued channel presented in Sections A.1.1 and A.1.2. We focus on the
proof that

$$\mathcal{C}_{\text{Gaussian}} \subseteq \mathcal{C}_{\text{det}} + \text{constant};$$

the other direction follows by reversing the steps and using the argument for the real-valued channel,
and is omitted. The eventual gap is 42 bits, roughly double that of the real-valued case.
The complex Gaussian interference channel is given by 

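$$\begin{aligned}
y_1 &= h_{11} x_1 + h_{12} x_2 + z_1 \\
y_2 &= h_{21} x_1 + h_{22} x_2 + z_2,
\end{aligned}$$

where $z_i \sim \mathcal{CN}(0, 1)$ and the channel inputs satisfy an average power constraint

$$\frac{1}{N} \sum_{k=1}^{N} E\left[|x_{i,k}|^2\right] \le P_i, \qquad i = 1, 2.$$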

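By scaling the outputs, we may set $P_i = 2$ and $z_i \sim \mathcal{CN}(0, 2)$. We assume without loss of generality
that the cross gains have zero phase, i.e. $\operatorname{Im}(h_{12}) = \operatorname{Im}(h_{21}) = 0$, since each of the receivers may
simply rotate the output appropriately. These assumptions allow us to write the output of the channel
as

$$\begin{pmatrix} y_{1R} \\ y_{1I} \end{pmatrix}
= \begin{pmatrix} h_{11}^R & -h_{11}^I \\ h_{11}^I & h_{11}^R \end{pmatrix}
\begin{pmatrix} x_{1R} \\ x_{1I} \end{pmatrix}
+ \begin{pmatrix} h_{12} & 0 \\ 0 & h_{12} \end{pmatrix}
\begin{pmatrix} x_{2R} \\ x_{2I} \end{pmatrix}
+ \begin{pmatrix} z_{1R} \\ z_{1I} \end{pmatrix}, \tag{53}$$

and similarly for $y_2$. Here $R$ and $I$ denote real and imaginary part, respectively, and $z_{1R}, z_{1I} \sim
\mathcal{N}(0, 1)$.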
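The real-valued form of the output at receiver 1 can be checked against ordinary complex arithmetic. This is a small sketch assuming a real cross gain `h12` (zero phase, as in the text); the function name and the sample values are hypothetical.

```python
def receiver1_real_form(h11, h12, x1, x2, z1):
    """Output at receiver 1 as a (real part, imaginary part) pair; h12 is assumed real."""
    # Direct link: multiplication by h11 acts as a scaled rotation on (Re x1, Im x1).
    y_r = h11.real * x1.real - h11.imag * x1.imag
    y_i = h11.imag * x1.real + h11.real * x1.imag
    # Cross link: a real gain scales the two components separately.
    y_r += h12 * x2.real + z1.real
    y_i += h12 * x2.imag + z1.imag
    return y_r, y_i

# Hypothetical sample values.
h11, h12 = 1.5 - 0.5j, 0.75
x1, x2, z1 = 0.3 + 0.4j, -0.2 + 0.9j, 0.1 - 0.05j
y_r, y_i = receiver1_real_form(h11, h12, x1, x2, z1)
y = h11 * x1 + h12 * x2 + z1  # direct complex computation for comparison
```

The pair `(y_r, y_i)` agrees with `(y.real, y.imag)` up to floating-point rounding.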

Step 1: Peak power constraint instead of average power constraint (lose 8 bits). The argument
is almost identical to that of Step 1 in A.1.1. We truncate the inputs, letting the part of the input $x_{iR}$
that exceeds the peak power constraint be

$$\hat{x}_{iR} = \lfloor x_{iR} \rfloor = \operatorname{sign}(x_{iR}) \sum_{k=-\infty}^{0} x_{iR}(k) 2^{-k},$$

and let

$$\bar{x}_{iR} = x_{iR} - \hat{x}_{iR} = \operatorname{sign}(x_{iR}) \sum_{k=1}^{\infty} x_{iR}(k) 2^{-k}$$

be the remaining signal, with similar definitions for $x_{iI}$ with $I$ replacing $R$. The signals $\bar{x}_{iR}, \bar{x}_{iI}$
are defined so that $\bar{x}_i = \bar{x}_{iR} + j \bar{x}_{iI}$ satisfies the peak power constraint of 2. Let $\bar{y}_i$ be the output at
receiver $i$ due to the truncated inputs. The development in Step 1 of A.1.1 shows that

$$I(x_1^N; y_1^N) \le I(\bar{x}_1^N; \bar{y}_1^N) + H(\hat{x}_1^N) + H(\hat{x}_2^N). \tag{54}$$
The estimate

$$H(\hat{x}_1^N) \le 4N$$

follows from the argument of Lemma 6 by translating an arbitrary strategy for a point-to-point
deterministic channel to a corresponding Gaussian channel with SNR = 1, with a loss of at most 3
bits (1.5 bits per complex dimension). The point-to-point Gaussian channel has capacity 1, giving
the estimate.
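The truncation in Step 1 splits each real component into a signed integer part and a signed fractional part. The sketch below illustrates the split on floating-point numbers and checks that the retained fractional parts satisfy the peak power constraint of 2; the function names and sample inputs are hypothetical.

```python
import math

def split_real(x):
    """Split x into (excess, remainder): the signed integer part and the signed fractional part."""
    excess = math.copysign(math.floor(abs(x)), x)
    remainder = x - excess
    return excess, remainder

def truncate_complex(x):
    """Keep only the fractional parts of the real and imaginary components of x."""
    _, r = split_real(x.real)
    _, i = split_real(x.imag)
    return complex(r, i)

# Hypothetical sample inputs.
for x in [3.7 - 2.2j, -0.5 + 0.999j, 10 + 0j, -4.25 - 7.75j]:
    truncated = truncate_complex(x)
    # Each component has magnitude below 1, so the squared magnitude is below 2.
    assert abs(truncated) ** 2 < 2
```

Each retained component has magnitude strictly less than 1, which is why the truncated complex input has power less than 2 in every channel use.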

Step 2: Truncate signals at noise level, remove fractional part of channel gains, and remove
noise (lose 5.1 bits). The argument repeats that of Step 2 in A.1.1 and is omitted.

Step 2': Single letter expression, decoupling of real and imaginary components, and near
optimality of uniform input distribution (lose 6 bits). After decoding the message of the intended
user, each receiver has a clear view of the common message of the interfering user. Thus, the
capacity region of the channel of Step 2 is given by (22).

Next, using a similar argument to that in Step 2' for the real-valued case, it can be shown that
i.i.d. uniformly distributed inputs are nearly optimal on a modified channel, with a loss of at most 4
bits per user. The modified channel replaces the direct gain $h_{ii}^R$ with $|h_{ii}^R| + |h_{ii}^I|$, and sets $h_{ii}^I = 0$.
The support of the output is at least as large in the modified channel under uniformly distributed
inputs, and moreover, the output is independent over time. Thus, this step decouples the real and
imaginary components. The argument for the real-valued channel can now be applied to the real
and imaginary components of the complex channel.

Steps 3, 4, and 5: Positive inputs and channel gains (lose 4 bits), addition over $\mathbb{F}_2$ (lose 2 bits),
channel gains of the form $2^n$. Steps 3 and 4 are identical to the real-valued case. In Step 5 the
direct gains $|h_{ii}^R| + |h_{ii}^I|$ are replaced with $2^{\lfloor \log(|h_{ii}^R| + |h_{ii}^I|) \rfloor}$. Similarly, the cross gains $|h_{12}|$ and $|h_{21}|$
are replaced with $2^{\lfloor \log |h_{12}| \rfloor}$ and $2^{\lfloor \log |h_{21}| \rfloor}$, respectively.

Step 6: Combine real and imaginary parallel channels (lose 4 bits). Now, the resulting deter- 
ministic channel from Step 5 is precisely the same as the deterministic channel in the real-valued 
case, but with twice as many channel uses (one each for the real and imaginary part of the signal). 
Hence the capacity region of the complex deterministic channel is the same as for the real-valued 
channel, but scaled by two. Note that the capacity region for the deterministic channel dTSl ) exactly 
doubles when all the channel gains are squared. We have 

$$2^{2 \lfloor \log(|h_{ii}^R| + |h_{ii}^I|) \rfloor} \le 2^{\lfloor 1 + \log(|h_{ii}^R|^2 + |h_{ii}^I|^2) \rfloor} = 2^{1 + \lfloor \log \mathsf{SNR}_i \rfloor},$$

which shows that changing the gain to $2^{\lfloor \log \mathsf{SNR}_i \rfloor}$ changes at most one bit of the output in each
complex dimension. Similarly, at most one bit of the output at receiver 1 is changed by changing
the cross gain $2^{2 \lfloor \log |h_{12}| \rfloor}$ to $2^{\lfloor \log \mathsf{INR}_2 \rfloor}$. Thus, at most 4 bits per user are lost in making this final
modification to the channel.
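The exponent inequality used in Step 6 rests on $(a + b)^2 \le 2(a^2 + b^2)$ for nonnegative $a$ and $b$. The sketch below checks the resulting inequality between the two exponents on a grid of hypothetical complex gains, using base-2 logarithms; the function name is introduced for illustration only.

```python
from math import floor, log2

def direct_gain_exponents(h):
    """Return (2*floor(log2(|Re h| + |Im h|)), 1 + floor(log2(|h|^2))) for a nonzero complex gain h."""
    a, b = abs(h.real), abs(h.imag)
    lhs = 2 * floor(log2(a + b))          # exponent after squaring the quantized modified gain
    rhs = 1 + floor(log2(a * a + b * b))  # one more than the exponent of the quantized |h|^2
    return lhs, rhs

# Hypothetical grid of gains with components that are multiples of 1/8.
for i in range(41):
    for j in range(41):
        if i == 0 and j == 0:
            continue  # the zero gain has no logarithm
        lhs, rhs = direct_gain_exponents(complex(i / 8, j / 8))
        assert lhs <= rhs
```

Equality occurs, for example, at $h = 1 + j$, where both exponents equal 2.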